INTRODUCTION

For unknown reasons, prostate cancer incidence and mortality rates for African
American males are among the highest in the world. Very few hereditary prostate cancer
studies have included African Americans. This is unfortunate, especially since the
population history of African Americans is quite different from that of other populations.
Thus, genetic predisposition to a common disease like prostate cancer may also be different.
The identification of susceptibility genes will provide insight into critical rate-limiting
steps in the carcinogenic pathway of both inherited and sporadic cases of the disease. The
specific goals of this project are as follows: (1) Extraction of genomic DNA from blood
collected from 2000 unrelated men (1500 African Americans and 500 European
Americans) from Columbia, South Carolina; Chicago, Illinois; and Washington, D.C. (2)
Genotyping of microsatellite (STR) loci and single nucleotide polymorphisms
(SNPs) in order to construct compound haplotypes from three candidate genomic regions.
(3) Analysis of the effects of differences between haplotypes on vulnerability to prostate
cancer and related PSA levels using cladistic association analysis (Templeton et al.,
1987). Our expectation for this project is to determine whether any of the candidate gene
regions from a large sample of clinically evaluated and unrelated African American
males are significantly associated with prostate cancer and related physiological
biomarkers.

BODY 

The  specific  aims  as  listed  in  the  Statement  of  Work  are  as  follows: 

Task 1. Start-up phase and subject recruitment (Months 1-5).

- Recruit and hire a research associate.
- Identify and recruit subjects into study.
- Evaluate clinical status of subjects.

Task 2. Data collection (Months 3-20).

- Extraction of genomic DNA from blood samples.
- Genotyping of DNA samples.
- Collection of epidemiological data.

Task 3. Interim analyses (Months 10-22).

- Infer haplotypes from genotypic data.
- Enter genetic data and epidemiological data into database.
- Perform preliminary data analysis.

Task 4. Final analyses, publications and presentations (Months 18-24).

- Perform data and statistical analyses.
- Test hypotheses.
- Manuscript preparations.

TASK 1:

Dr. Kittles has established individual collaborations with Srinivasan Vijayakumar, M.D.,
a radiation oncologist at the University of Chicago and Michael Reese Hospital in
Chicago; Sally Weinrich, Ph.D., at the University of South Carolina, Columbia, SC; and
Chiledum A. Ahaghotu, M.D., a urologist at Howard University Hospital. Over the past
year, the collaborators have been quite successful in recruiting a cohort of cases and
controls from the African American and European American communities in Columbia,
SC, Chicago and Washington, DC. The table below details the numbers recruited thus
far.


Table 1.

Population            Prostate cancer patients    Age matched controls    Total
African Americans     510                         705                     1,215
European Americans    200                         300                     500
TOTAL                 700                         1,100                   1,715


TASKS  2, 3  and  4: 

Genomic DNA has been extracted from all blood specimens collected using a slight
variation of the Puregene DNA extraction protocol. We have started genotyping the
androgen receptor gene trinucleotide repeat polymorphisms. The androgen receptor (AR)
interacts with androgens to promote cell division (normal and malignant) in the prostate
gland. The AR binds dihydrotestosterone and stimulates the transcription of a cascade of
androgen-responsive genes. Because of this relationship, many have proposed
that the AR may be one genetic predictor of susceptibility to prostate cancer. There are
two polymorphic regions in the N-terminal protein domain of the AR, which are encoded
in the first exon of the AR gene. These are the polyglutamine repeat region (CAG)n and
the polyglycine repeat region (GGC)n (Stanford et al., 1997; Irvine et al., 1995;
Giovannucci et al., 1997; Edwards et al., 1992). We genotyped the CAG and GGC loci
for approximately 950 individuals (1063 chromosomes) using fluorescent-dye-labeled
PCR primers and the ABI 377 DNA sequencer. Control populations we have examined
thus far for the CAG and GGC markers include African Americans (N=520), Gold Coast
Africans (Nigeria and Ghana, N=85), Sierra Leoneans (N=210), European Americans
(N=85), Amerindians (N=103), and Asians (Chinese, N=60).

A total of 27 CAG (range: 5-31 repeats) and 23 GGC alleles (range: 2-24 repeats) were
observed. Not surprisingly, African Americans had the most alleles. The European and
Asian populations possessed the fewest alleles at the two loci. Populations of
African descent possessed significantly shorter repeats than non-African populations
(paired t-test, p<0.00001). The entire range of CAG repeat variation was observed among
populations of African descent. Interestingly, the purported high-risk CAG repeat lengths
(<20) were most prevalent among populations of African descent. The non-random
association of CAG and GGC alleles, linkage disequilibrium (LD), was assessed for each
study population. Significant evidence for linkage disequilibrium between the two
markers was observed only among African American cases and controls (p=0.00001)
and Amerindians (p=0.009). The LD observed in Amerindians was consistent with their
population history of recent population bottlenecks. The high level of linkage
disequilibrium among African Americans is likely due to admixture. This assessment of
linkage disequilibrium in the African American population is important for two
reasons. First, the high level of stratification in the African American population may be
a confounder in disease association studies if the substructure is not controlled for.
Second, the identification of high-risk haplotypes is potentially more powerful in
disease studies than single-locus analyses. A preliminary analysis of androgen receptor
haplotype risk and prostate cancer has revealed an association of closely related
haplotypes with high-grade cancer. We intend to increase the resolution in identification
of these possible high-risk haplotypes by typing single nucleotide polymorphisms (SNPs)
within the gene. This work was published in Human Genetics in late 2001.

Another gene we have studied is the human steroid 5α-reductase type 2 gene (SRD5A2)
located on chromosome 2. SRD5A2 encodes the isoenzyme 5α-reductase, which is
responsible for the intracellular conversion of testosterone to its reduced form,
dihydrotestosterone (DHT). DHT promotes prostate cell division and may be involved in
benign and neoplastic growth of the prostate in elderly men (Labrie et al., 1993). It has
also been suggested that differences in androgen synthesis and metabolism may be
responsible for ethnic variation in prostate cancer risk (Ross et al., 1992). Thus genetic
variability of the SRD5A2 gene and subsequent enzyme activity may be important risk
factors in prostate cancer. A dinucleotide repeat (TA) marker has been observed in exon
5 of the gene. Preliminary studies have shown that, like the androgen receptor CAG and
GGC repeat loci, allelic distributions of this polymorphic marker vary considerably
between high-risk and low-risk populations (Reichardt et al., 1995). As with the
androgen receptor, the TA repeat and a SNP that abolishes an RsaI restriction
site within the SRD5A2 gene have been typed for all the samples collected thus far. In
addition, we have started screening the entire gene for SNPs using a core set of DNA
samples from our cohort. Exon 1 has been screened and sequenced for about 60 samples
from men with prostate cancer, 10 African control samples, and 25 Asians. We are currently
characterizing a SNP in this exon that abolishes a BstUI site. We
genotyped this SNP using fluorescent-labeled primers and the ABI 377 sequencer, along
with two other markers, the (TA) repeat located in exon 5 and the RsaI RFLP, in order
to create haplotypes. Haplotype analyses (cladistic analyses) revealed no association with
prostate cancer or related clinical phenotypes.

We also examined another genetic region that has been shown to play a role in hereditary
forms of prostate cancer. This region is on chromosome X. Evidence for a prostate cancer
susceptibility locus on the X chromosome has been observed using linkage analysis on
certain families by NHGRI investigators (Xu et al., 1998). The region implicated,
Xq27-28, is not near the androgen receptor. In fact, more than 50 cM separates the suspected
locus from the androgen receptor. Five microsatellites on chromosome X near the
Xq27-28 region were genotyped using fluorescent-labeled primers and the ABI 377 sequencer.
The microsatellites included DXS8106, DXS984, DXS1193, DXS1205, and DXS1227.
Allele and haplotype analyses of this region on chromosome X did not reveal any
correlations with prostate cancer in our populations.

Single nucleotide polymorphisms (SNPs) are useful genetic markers for investigating
genes that affect disease susceptibility and drug responsiveness. In order to efficiently genotype
SNPs within the candidate genes of interest, our laboratory utilized a novel methodology
called Pyrosequencing. The method involves immobilization of amplified, biotinylated
DNA sequence products, which contain a sequence variant, onto sepharose beads. Primer
extension reactions are carried out by stepwise elongation of the sequencing primer
strand upon sequential addition of different deoxynucleoside triphosphates and
simultaneous degradation of unincorporated nucleotides by the enzyme apyrase. As the
minisequencing reaction continues, the complementary DNA strand extends and the
DNA sequence is determined from the individual peaks in the pyrogram. The Pyrosequencing
system is an automated DNA sequence analysis machine (PSQ 96) that uses 96-well
microtiter plates and reagents for SNP detection. The accompanying software creates a
sequence database of the SNP and also predicts genotype results with quality assessment
of individual samples. We recently published a paper on this methodology in the journal
American International Biotechnology Laboratory.
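
The stepwise primer extension described above can be illustrated with a small idealized
simulation. This is only a sketch, not the PSQ 96 software: the function name `pyrogram`,
the example sequences, and the dispensation order are hypothetical, and the model assumes
that each dispensed nucleotide is incorporated through a whole run of identical bases and
that peak height is simply the number of bases incorporated.

```python
from typing import List, Tuple


def pyrogram(strand_to_synthesize: str, dispensation_order: str) -> List[Tuple[str, int]]:
    """Return (nucleotide, idealized peak height) for each dispensed nucleotide.

    strand_to_synthesize is the sequence that the extending primer will build,
    i.e. the complement of the immobilized template (hypothetical input).
    """
    position = 0
    peaks = []
    for nucleotide in dispensation_order:
        incorporated = 0
        # The dispensed nucleotide is incorporated for as long as it matches
        # the next base to be synthesized (a homopolymer run gives a taller peak).
        while (position < len(strand_to_synthesize)
               and strand_to_synthesize[position] == nucleotide):
            incorporated += 1
            position += 1
        # A nucleotide that does not match gives no signal (height 0); in the
        # real reaction it would be degraded by apyrase before the next addition.
        peaks.append((nucleotide, incorporated))
    return peaks


# Hypothetical A/G SNP at the third position, same dispensation order for both alleles.
allele_a_peaks = pyrogram("TCAGT", "TCAGT")
allele_g_peaks = pyrogram("TCGGT", "TCAGT")
```

In this toy example the two alleles differ at the A and G dispensations (peak heights 1 and 1
for the A allele, 0 and 2 for the G allele), which is the kind of pattern difference that
distinguishes genotypes in a pyrogram.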


Cladistic  association  analyses  of  androgen  receptor  haplotypes 
In addition to the two microsatellites (CAG and GGN) within exon 1 of the androgen
receptor gene, there exists a single nucleotide polymorphism (G633A). The SNP is a
synonymous substitution that does not appear to alter protein function. However, since it
is polymorphic, it provides information to differentiate haplotypes in order to find
possible susceptibility haplotypes for disease. In this study we evaluated whether increased risk
for prostate cancer and associated clinical characteristics existed among variants of the
androgen receptor gene (haplotypes) using the evolutionary history of the haplotypes.
Prostate cancer patients were 40-100 years old and histologically diagnosed within the
last 2 years. Gleason grade and tumor stage characteristics were combined to define low
and high stage/grade. Healthy controls had normal DREs and PSA <4.0 ng/ml. African
Americans (107 prostate cancer (PCa) patients and 165 healthy volunteers) were recruited from Howard
University Hospital, Washington, DC. Nigerians (65 prostate cancer patients and 48
healthy controls, all belonging to the Yoruba ethnic group) were recruited at University
College Hospital in Ibadan and Central Hospital in Benin City, Nigeria. European
Americans (121 PCa patients) were recruited at Michael Reese Hospital, Chicago, IL. The
two tables below provide clinical and genetic information on the populations.

Table 2.

Population (N)              % affected    Age range (mean)    Grade/stage    PSA (mean)
Nigerians (113)             58%           37-96 (62)          87% high       0.1-28 (1.68)
African Americans (272)     39%           37-88 (59)          43% high       0.1-253 (8.17)
European Americans (121)    100%          61-77 (62)          100% low       0.0-45 (7.32)


Table 3.

Population (N)              CAG (mean)    GGC (mean)    A-allele    # of haplotypes
Nigerians (113)             3-23 (13)     5-18 (13)     66%         59 distinct
African Americans (272)     6-27 (14)     5-22 (13)     57%         102 distinct
European Americans (121)    6-22 (15)     8-19 (14)     16%         58 distinct


Contingency table analyses revealed a strong association between specific androgen receptor
haplotypes and prostate cancer.

1. Strong association of AR haplotypes with prostate cancer in African Americans (p =
0.0006) and Nigerians (p = 0.005).

2. Nominal association with grade/stage in African Americans (p = 0.05).

Two-way ANOVA analyses also reveal an association of AR haplotypes with PSA levels
in Nigerians (p = 0.001). A second round of statistical analyses is being performed in
order to confirm our findings.

OTHER CANDIDATE GENES EXAMINED:


It is well known that androgens play an important role in the etiology of prostate cancer.
The CYP17 gene encodes the cytochrome P450c17α enzyme, which is the rate-limiting
enzyme in androgen biosynthesis. A T to C polymorphism in the 5' promoter region has
recently been associated with prostate cancer. However, contradictory data exist
concerning the risk allele. To further investigate the involvement of the CYP17 variant
in prostate cancer, we typed the polymorphism in three different populations and
evaluated its association with prostate cancer and clinical presentation in African
Americans. We genotyped the CYP17 polymorphism in Nigerian (n = 56), European
American (n = 74), and African American (n = 111) healthy male volunteers, along with
African American men affected with prostate cancer (n = 71), using Pyrosequencing.
Genotype and allele frequencies did not differ significantly across the different control
populations. African American men with the CC CYP17 genotype had an increased risk
of prostate cancer [odds ratio (OR), 2.8; 95% confidence interval (CI), 1.0-7.4]
compared to those with the TT genotype. A similar trend was observed between the
homozygous variant genotype in African American prostate cancer patients and clinical
presentation. The CC genotype was significantly associated with higher grade and stage
of prostate cancer (OR, 7.1; 95% CI, 1.4-36.1). The risk did not differ significantly by
family history or age. Our results suggested that the C allele of the CYP17 polymorphism
is significantly associated with increased prostate cancer risk and clinically advanced
disease in African Americans. This work was published in Cancer Epidemiology
Biomarkers and Prevention in 2001.
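
For readers unfamiliar with how an odds ratio and its confidence interval are obtained
from genotype counts, the following minimal sketch uses the standard log-odds (Woolf)
approximation. The report does not state which method produced the intervals above, so
the method choice is an assumption, and the counts in the example are hypothetical
illustrations rather than the study data.

```python
import math


def odds_ratio_ci(case_exposed, control_exposed, case_unexposed, control_unexposed, z=1.96):
    """Odds ratio with an approximate 95% CI from a 2x2 table (Woolf method).

    "Exposed" here would mean carrying the genotype of interest (e.g. CC) and
    "unexposed" the reference genotype (e.g. TT). All four counts must be > 0.
    """
    odds_ratio = (case_exposed * control_unexposed) / (control_exposed * case_unexposed)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / case_exposed + 1 / control_exposed
                          + 1 / case_unexposed + 1 / control_unexposed)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 12 CC cases, 8 CC controls, 20 TT cases, 40 TT controls.
example_or, example_lower, example_upper = odds_ratio_ci(12, 8, 20, 40)
```

With small cell counts the interval becomes wide, which is why an interval such as 1.4-36.1
can accompany a large point estimate.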

An A/G SNP within the promoter of the CYP3A4 gene has previously been associated
with prostate cancer in African Americans. However, the SNP exhibits large differences
in allele frequency between populations. Given that the African American population is
genetically heterogeneous because of its African ancestry and subsequent admixture with
European Americans, case-control studies using African Americans are highly
susceptible to spurious associations. To test for association with prostate cancer, we
genotyped CYP3A4-V in prostate cancer patients and age- and ethnicity-matched controls
representing African Americans, Nigerians, and European Americans. To detect
population stratification among the African American samples, 10 unlinked genetic
markers were genotyped. To correct for the stratification, as proposed by Reich and
Goldstein (2001), the uncorrected association statistic was divided by the average of
association statistics across the 10 unlinked markers. Sharp differences in CYP3A4-V
frequencies were observed between Nigerian and European American controls (0.87 and
0.10, respectively; P<0.0001). African Americans were intermediate at 0.66. An
association uncorrected for stratification was observed between CYP3A4-V and prostate
cancer in African Americans (p=0.007). An association was also observed among
European Americans (p=0.02) but not Nigerians. In addition, the unlinked genetic marker
test provided strong evidence of population stratification among African Americans. Due
to the high level of stratification, the corrected P-value was not significant (p=0.25).
Follow-up studies on a larger dataset will be needed to confirm whether the association is
indeed spurious. However, these results reveal the potential for confounding of association
studies using African Americans and the need for study designs that take into account
substructure due to differences in ancestral proportions between cases and controls. This
work was recently published (2002) in Human Genetics.
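
The correction described above (dividing the candidate marker's association statistic by
the average statistic across unlinked markers) can be sketched as follows. This is an
illustration under stated assumptions: the statistics are treated as 1-degree-of-freedom
chi-square values, the corrected value is compared to the same chi-square distribution
(a simplification chosen here, not a detail given in the report), and all numbers and
function names are hypothetical.

```python
import math


def chi2_p_value_1df(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


def stratification_corrected(candidate_chi2, unlinked_chi2_values):
    """Divide the candidate statistic by the mean statistic of unlinked markers."""
    inflation = sum(unlinked_chi2_values) / len(unlinked_chi2_values)
    corrected = candidate_chi2 / inflation
    return corrected, chi2_p_value_1df(corrected)


# Hypothetical values: one candidate marker and 10 unlinked markers.
candidate = 8.0
unlinked = [2.0, 6.0, 1.0, 5.0, 3.0, 7.0, 4.0, 4.0, 2.0, 6.0]  # mean = 4.0
uncorrected_p = chi2_p_value_1df(candidate)
corrected_stat, corrected_p = stratification_corrected(candidate, unlinked)
```

When the unlinked markers themselves show inflated statistics, as in this toy example, an
apparently significant uncorrected result can become non-significant after correction,
which mirrors the pattern reported for CYP3A4-V.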

KEY  RESEARCH  ACCOMPLISHMENTS 

•  Recruitment of 1800 clinically evaluated prostate cancer patients and healthy
   volunteers.

•  Collected over 6,500 genotypes from 3 markers within the androgen receptor gene.

•  Collected over 5,200 genotypes from the steroid 5α-reductase type 2 gene.

•  Genotyped 5 microsatellites within and around the chromosome Xq27-28 region.

•  Developed new genotyping methodology for high-throughput analyses.

•  Published seven manuscripts related to the research.

CONCLUSIONS

The African American population possesses unique genetic features due to its history of
antiquity and admixture. When a disease such as prostate cancer manifests variation in
incidence and mortality between populations, admixed populations provide a
population-based approach to evaluate the relative importance of genetic factors. The genetic
resources generated by this project are directed towards this end and enabled us to utilize
genomic technologies to characterize the functional implications of DNA variation in
these populations. Since this study takes advantage of the genetics of unrelated men from
diverse ethnic populations, the results may be generalizable to the larger American
population. The assessment and publication of genetic variation within the candidate
genes for prostate cancer is quite significant because it (1) provides access to the
data and allows others to compare their data on other populations for the same markers;
(2) encourages others to study the same markers in other populations so that their
populations can be placed into a global framework; and (3) stimulates researchers to
develop new models and methods to analyze the data.

Abstract

While studies have implicated alleles at the CAG and GGC trinucleotide repeats of the
androgen receptor gene with high-grade, aggressive prostate cancer disease, little is
known about the normal range of variation for these two loci, which are separated by
about 1.1 kb. More importantly, few data exist on the extent of linkage disequilibrium
(LD) between the two loci in different human populations. Here we present data on CAG
and GGC allelic variation and LD in six diverse populations. Alleles at the CAG and GGC
repeat loci of the androgen receptor were typed in over 1000 chromosomes from Africa,
Asia, and North America. Levels of linkage disequilibrium between the two loci were
compared between populations. Haplotype variation and diversity were estimated for each
population. Our results reveal that populations of African descent possess significantly
shorter alleles for the two loci than non-African populations (P<0.0001). Allelic
diversity for both markers was higher among African Americans than any other population,
including indigenous Africans from Sierra Leone and Nigeria. Analysis of molecular
variance revealed that approximately 20% of CAG and GGC repeat variance could be
attributed to differences between the populations. All non-African populations possessed
the same common haplotype, while the three populations of African descent possessed three
divergent common haplotypes. Significant LD was observed in our sample of healthy African
Americans. The LD observed in the African American population may be due to several
factors: recent migration of African Americans from diverse rural communities following
urbanization, recurrent gene flow from diverse West African populations, and admixture
with European Americans. This study represents the largest genotyping effort to be
performed on the two androgen receptor trinucleotide repeat loci in diverse human
populations.


Introduction 

The human androgen receptor (AR) is a ligand-dependent nuclear transcription factor that
regulates the expression of genes necessary for the growth and development of both normal
and malignant prostate tissue. The AR gene is about 90 kb and is located on chromosome
Xq11-12. Exon 1 of the gene encodes the N-terminal domain, which controls transcriptional
activation of the receptor. Exon 1 also encodes two polymorphic trinucleotide repeats
(CAG and GGC), which code for polyglutamine and polyglycine tracts, respectively, in the
N-terminal domain. In vitro studies have demonstrated an inverse relationship between
CAG repeat length and AR transcriptional activation ability (Chamberlain et al. 1999).

Variations in CAG repeat length have been associated with a number of genetic diseases.
Kennedy's disease results from expansion of the AR CAG repeat (LaSpada et al. 1991),
and expansions of CAG repeats in other genes underlie spinocerebellar ataxia 1 (SCA1)
and Huntington's disease (Orr et al. 1993). In addition, short CAG and GGC repeat
lengths have been widely linked to increased risk of developing prostate cancer
(Giovannucci et al. 1997; Hardy et al. 1996; Platz et al. 1998). More specifically,
CAG repeat lengths of fewer than 20 and GGC repeat lengths of fewer than 16 have been
associated with increased risk of developing prostate cancer (Giovannucci et al. 1997;
Platz et al. 1998; Stanford et al. 1997). Striking differences in CAG repeat lengths have
been observed between populations. Black men tend to have significantly shorter repeats
than their white counterparts (Edwards et al. 1992; Irvine et al. 1995; Sartor et al.
1999). These genetic differences may be important in understanding why populations of
African descent are more susceptible to developing prostate cancer. African American men
have the highest incidence rate of prostate cancer of any ethnic group in the United
States (Brawley and Kramer 1996). Increasing evidence suggests that prostate cancer is
more prevalent in populations of African descent (Glover et al. 1998; Ogunbiyi and Shittu
1999; Osegbe 1997). However, attempts to explain the disparity in risk between
populations are limited. Although diet (e.g., fat intake) may help to explain the high
prevalence of prostate cancer among African Americans, such an influence may be limited
when considering other populations of African descent (e.g., Caribbeans and West
Africans) whose diets differ considerably.

Genetic studies on the AR CAG and GGC loci have focused mainly on European American
prostate cancer patients and controls. Little is known about AR haplotypic variation,
especially among different human populations. Evaluating variation in AR trinucleotide
repeat lengths across human populations may provide a better understanding of the ethnic
disparity associated with prostate cancer. Studies have not formally evaluated variation
and the extent of linkage disequilibrium between the two trinucleotide repeat loci across
human populations and in particular among those that may have contributed to the African
American gene pool. This would be an important prerequisite to determining if there are
subpopulations of disease chromosomes segregating in high-risk groups such as African
Americans.

When the occurrence of pairs of specific alleles at different loci on the same haplotype
is not independent, the deviation from independence is termed linkage disequilibrium
(LD). LD is a population genetic phenomenon that has been useful for gene mapping
efforts. It is usually found in populations for genetic markers that are tightly linked
(at close genetic distance) and can be generated by mutation, selection, or admixture of
populations with different allele frequencies. Generally, disequilibrium is dependent on
population size, time (generations), and distance between genetic markers. Normally, the
greater the distance between markers, the faster the decay of disequilibrium. However,
for highly polymorphic markers such as microsatellites, the high mutation rate
contributes significantly to randomizing associations of alleles.
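
The dependence of disequilibrium on time and marker distance can be illustrated with the
textbook recombination-only model, in which D after t generations equals D_0 (1 - r)^t
for recombination fraction r. The sketch below is a simplification: it ignores mutation,
drift, selection, and admixture, all of which the paragraph above identifies as relevant,
and the function names and example values are hypothetical.

```python
import math


def ld_after_generations(d0, recombination_fraction, generations):
    """Expected D after a number of generations under recombination alone."""
    return d0 * (1.0 - recombination_fraction) ** generations


def generations_to_halve(recombination_fraction):
    """Number of generations for D to fall to half its starting value."""
    return math.log(0.5) / math.log(1.0 - recombination_fraction)


# Tightly linked markers (r = 0.001) retain far more disequilibrium after
# 100 generations than loosely linked markers (r = 0.05).
tight = ld_after_generations(0.2, 0.001, 100)
loose = ld_after_generations(0.2, 0.05, 100)
```

Under this model, markers separated by only about 1 kb, such as the CAG and GGC repeats,
would lose disequilibrium very slowly through recombination alone, so mutation and
demographic history become the more important influences.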

The aim of this study was to formally evaluate variation and the extent of linkage
disequilibrium between the AR gene CAG and GGC repeat loci in human populations,
particularly those of African descent such as African Americans.

The African American population is genetically and culturally heterogeneous due to its
unique history in the United States (Jackson 1993). While a significant portion of the
African American gene pool originates from Western and Central Africa, other populations
have also contributed to the present genetic makeup of the population. To better
understand variation within the African American population, we included comparative
populations representing West Africans (Nigeria and Sierra Leone), European Americans,
Chinese, and Amerindians.


Subjects  and  methods 

Unrelated African American men (n=520) were recruited from Columbia, South Carolina, for
prostate-specific antigen (PSA) screening over the past 5 years. Nigerians (n=85)
representing the Edo (Bini) ethnic group were recruited in the Udo community near Benin
City, Nigeria. European American men (n=90) were recruited from the Washington, DC area.
The African American, Nigerian, and European American men were recruited as healthy
community-based controls for prostate cancer studies. Inclusion criteria were men between
50 and 80 years of age with PSA levels less than 4.0 ng/ml and normal digital rectal
examinations. In addition, unrelated men representing the Mende ethnic group from Sierra
Leone (n=240), Han Chinese from Taiwan (n=60), and an Amerindian population from a
community in the southwestern United States (n=103) were also included. No clinical data
or medical history was collected for the Sierra Leone, Chinese, and Amerindian
participants. Informed consent for genetic analysis was obtained for all subjects.
Individuals of mixed ancestry were not excluded. Genomic DNA was isolated from
whole-blood samples using the Puregene (Gentra Biosystems) DNA isolation kit. The
trinucleotide repeat CAG and GGC loci were amplified by PCR using 50 ng genomic DNA.
Primers used to amplify the CAG locus were 5'-TCC AGA ATC TGT TCC AGA GCG TG-3' (forward)
and 5'-GCT GTG AAG GTT GCT GTT CCT CAT-3' (reverse). Primers specific for the GGC locus
were 5'-CCA GAG TCG CTC GCG ACT ACT ACA ACT TTC C-3' (forward) and
5'-GGA CTG GGA TAG GGC ACT CTG CTC ACC-3' (reverse).
Fluorescent dyes 6-FAM and HEX were used to label the forward primers for GGC and CAG,
respectively. PCR cycling conditions for the CAG locus were 35 cycles of 95°C for 30 s,
60°C for 30 s, and 72°C for 30 s. Conditions for the GGC locus were 25 cycles of
97°C for 30 s, 55°C for 30 s, and 72°C for 1 min.

PCR products for both loci were then pooled and electrophoresed
on an ABI 377 DNA sequencer (ABI, Foster City, Calif., USA).
Genescan and Genotyper 5.0 programs (ABI) were used to generate
fragment sizes and genotypes. Because genomic DNA was limited,
some samples could not be typed for both loci. Statistical
analyses comparing repeat length mean, mode, and variance among
populations were performed using Origin 5.0 (Microcal Software,
Northampton, Mass., USA). Heterozygosities for the two
trinucleotide repeats and for the haplotypes were computed as
$n(1-\sum p_i^2)/(n-1)$, where $p_i$ represents the frequency of
the $i$th allele or haplotype, and $n$ is the number of chromosomes
drawn from the population. Standard errors were obtained by using
equation 8.7 in Nei (1987). Standardized pairwise linkage
disequilibrium values (D'; Lewontin 1964) were calculated for all
pairs of microsatellite alleles observed within each population.
The null hypothesis of linkage equilibrium (D'=0) was tested, and
P values were obtained by Fisher's exact test using the Markov
chain method (Guo and Thompson 1992) implemented in the computer
program Arlequin 1.1 (Schneider et al. 1997).
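The heterozygosity estimator described above can be illustrated with a minimal Python sketch. The function name and the sample of repeat lengths are hypothetical and are not taken from the study data; the sketch only shows how n(1 - sum of squared allele frequencies)/(n - 1) is evaluated from a list of typed chromosomes.

```python
from collections import Counter


def gene_diversity(alleles):
    """Unbiased gene diversity H = n * (1 - sum(p_i^2)) / (n - 1).

    `alleles` holds one entry per chromosome (for example, a repeat count).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two chromosomes are required")
    counts = Counter(alleles)
    # Sum of squared allele frequencies (expected homozygosity).
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    return n * (1.0 - sum_p2) / (n - 1)


# Hypothetical CAG repeat lengths for eight chromosomes (illustration only).
sample = [17, 17, 18, 20, 20, 20, 21, 22]
H = gene_diversity(sample)
print(f"H = {H:.3f}")
```

The same function applies to haplotypes if each chromosome is represented by a (CAG, GGC) tuple instead of a single repeat count.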

Differences among populations were assessed by use of the
hierarchical analysis of molecular haplotype variance (AMOVA;
Excoffier et al. 1992; Michalakis and Excoffier 1996) implemented
in the Arlequin 1.1 package. AMOVA performs a hierarchical
analysis of three genetic-variance components: ΦST,
subpopulations relative to the total population; ΦSC,
subpopulations relative to continental groups; and ΦCT,
continental groups relative to the total population. For the
analysis, three groups containing the six populations were
defined: (a) populations of African descent (Nigerians, Sierra
Leoneans, and African Americans); (b) the European American
population; and (c) populations of Asian descent (Chinese and
Amerindian). The AMOVA assumed a single stepwise mutation model
(DiRienzo et al. 1994; Valdes et al. 1993) for the trinucleotide
repeat loci. In addition, pairwise genetic distances between
populations were computed from ΦST values as
D = ΦST/(1-ΦST) (Slatkin 1995). Significance levels of the
genetic variance components were estimated from 10,000 random
permutations.
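As a small worked illustration of the distance transformation D = ΦST/(1-ΦST), the following Python sketch converts a ΦST value into a linearized distance. The function name is hypothetical, and the input values are arbitrary examples rather than estimates from this study.

```python
def slatkin_distance(phi_st):
    """Linearized genetic distance D = phi_st / (1 - phi_st)."""
    if phi_st >= 1.0:
        raise ValueError("phi_st must be less than 1")
    return phi_st / (1.0 - phi_st)


# Arbitrary example values; small negative estimates can arise by chance
# when two populations are essentially undifferentiated.
for value in (-0.002, 0.05, 0.20):
    print(f"phi_ST = {value:+.3f} -> D = {slatkin_distance(value):+.4f}")
```

For small ΦST the distance is nearly equal to ΦST itself, and it grows faster than ΦST as differentiation increases.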


Results 

Allelic  diversity 

Tables 1 and 2 show the allelic diversity observed in the
six populations for the CAG and GGC markers. African
Americans possessed the greatest number of alleles for
both markers, which may be due in part to the larger
sample size. However, for the CAG locus 18 alleles were
observed among the considerably smaller sample of Nigerians.
The African American population possessed the highest gene
diversity for the CAG marker of all the populations, while
the greatest diversity for the GGC locus was observed in the
Sierra Leone population. The number of CAG alleles observed
ranged from 11 for Euroamericans to 21 for African Americans
(Table 1). For the GGC locus the number of alleles ranged
from 4 for both Asians and Amerindians to 17 for African
Americans (Table 2). The GGC allele with 15 repeats was highly
frequent in non-African populations, ranging from 55% to
80%. The 15-repeat allele was less frequent among West
Africans (5-10%) and intermediate in frequency among
African Americans at 23%. Strikingly low diversity was
observed at the GGC locus for Chinese and Amerindians.
Gene diversity for the two populations of Asian ancestry
was almost one-third of that observed for the African
populations (Table 2).

Table 1 CAG allelic diversity (N number of chromosomes, H gene
diversity)

Population          N     H       No. of alleles   Mean   Range    Variance
African American    516   0.951   21               17.8   9-31     10.97
Sierra Leone        230   0.918   17               17.3   10-26     7.77
Nigerian             83   0.909   18               16.7   5-28     17.28
Euroamerican         87   0.866   11               19.7   13-26     5.37
Asian                60   0.846   12               20.1   14-26     4.55
Amerindian           80   0.884   14               20.1   1?-30     8.62

Table 2 GGC allelic diversity (N number of chromosomes, H gene
diversity)

Population          N     H       No. of alleles   Mean   Range    Variance
African American    472   0.880   17               14.3   4-20      4.94
Sierra Leone        210   0.906   14               13.7   ?-24      5.50
Nigerian             78   0.771   10               13.8   8-19      3.44
Euroamerican         80   0.628   13               15.0   2-20      5.77
Asian                60   0.322    4               14.6   10-16     1.48
Amerindian          103   0.362    4               14.6   8-16      1.91

CAG and GGC allelic distributions are shown in Fig. 1.
These distributions portray a shift in the most common
allele among African versus non-African populations.
Allelic distributions were either unimodal or bimodal for all
populations except the Nigerians and Amerindians (Fig. 1).
The multimodal CAG allele distribution for the Nigerian
and Amerindian populations may be due to genetic
drift. For example, among Nigerians the 17-repeat allele at the
CAG locus is rare (<0.01), unlike in the other African
populations. Among the Amerindians both the 20 and 22
CAG alleles are common while the 19 and 21 alleles are
less frequent (Fig. 1).

As an alternative measure of intrapopulation diversity
for the microsatellite markers we calculated variances in
allele sizes for each locus (Tables 1, 2). Again, genetic
drift operating within the Nigerian and Amerindian
populations may have contributed to the higher variance in
number of CAG repeats than in the other populations.
Interestingly, the same trend was not observed in the Nigerian or
Amerindian populations for the GGC locus (Table 2). The
lowest variance for CAG allele size was observed among
the Europeans. The variance in CAG allele size among Europeans
was about one-fourth of that among Nigerians.
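The mean and variance of repeat length reported in Tables 1 and 2 can be obtained from allele counts as in the following Python sketch. The allele counts shown are hypothetical, and the use of the sample variance (divisor n - 1) is an assumption, since the text does not state which estimator the statistics package applied.

```python
def repeat_length_stats(allele_counts):
    """Return (mean, sample variance) of repeat length.

    `allele_counts` maps repeat length -> number of chromosomes carrying it.
    """
    n = sum(allele_counts.values())
    if n < 2:
        raise ValueError("at least two chromosomes are required")
    mean = sum(length * c for length, c in allele_counts.items()) / n
    # Sample variance with divisor n - 1 (an assumption, see lead-in).
    ss = sum(c * (length - mean) ** 2 for length, c in allele_counts.items())
    return mean, ss / (n - 1)


# Hypothetical CAG allele counts for eight chromosomes (illustration only).
counts = {18: 2, 20: 4, 22: 2}
mean_len, var_len = repeat_length_stats(counts)
print(f"mean = {mean_len:.1f} repeats, variance = {var_len:.2f}")
```

Under a stepwise mutation model, a larger variance in repeat number reflects a wider spread of allele sizes, which is why it serves as a complementary measure to gene diversity.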

Variance in allele size for the GGC marker was one-third
that of CAG allele size variance. The populations
with the lowest GGC allele size variance were Asians and
Amerindians. A notable trend observed among the
variances calculated was that variances for the African
populations were almost twice those of the non-African
populations. This is consistent with the findings of other studies
that have examined microsatellite diversity. These studies
reveal higher gene diversity among African populations
and significant genetic differences between African and
non-African populations (Jorde et al. 1995, 2000; Nei and
Takezaki 1996; Reich and Goldstein 1998; Shriver et al.
1997).


Haplotype  diversity  and  LD 

AR CAG and GGC haplotype frequencies and D' values
were determined for each population and are available
from the Human Genome Diversity Laboratory.
Haplotype diversity was greatest for African populations, lower
for European Americans, and lowest for Amerindians
(Table 3). The 15 most common AR haplotypes, their
frequencies, and LD values are shown in Table 4. Highly
divergent haplotypes were observed among the African
populations. All non-African populations possessed the same
most common haplotype, designated 20-15 (20 CAG
repeats and 15 GGC repeats). Table 4 reveals that the
frequency of the 20-15 haplotype in non-African populations
ranged from 13% among Euroamericans to 20% among
the Asians. Populations of African descent each possessed
a different common haplotype. The most common
haplotype among African Americans was 16-16 at 5%
frequency. Among Nigerians three common haplotypes were
observed, 13-14, 19-14, and 20-14, each at a frequency of
6.5%. In Sierra Leone the most common haplotype was
16-12 at 7.4%. The non-African populations did not
possess population-specific haplotypes. In fact, the
non-African populations possessed a subset of the variation
observed among the African populations. This is similar
to observations using other genetic markers (Tishkoff et
al. 1996, 1998).

Fig. 1 Allele frequency distributions of CAG (left) and GGC (right) microsatellites from the six human populations included in this
study. X-axis Number of repeats; y-axis frequency

Table 3 Androgen receptor haplotype diversity (h)

Population          No. of haplotypes   h
African American    132                 0.991±0.015
Sierra Leone         92                 0.990±0.002
Nigerian             50                 0.992±0.003
Euroamerican         40                 0.980±0.007
Asian                23                 0.946±0.018
Amerindian           29                 0.938±0.020

Of all possible pairwise LD comparisons of polymorphic
alleles at the two loci, 10% (35 of 357) were significant
(P<0.05) for African Americans. The percentage of
significant D' values for comparisons of alleles among the
other populations ranged from 4% for the Chinese to 8%
for Amerindians and European Americans (data not
shown). These differences in levels of LD could be due to
recent admixture, drift, substructure, or differences in power
to detect allelic associations.

Although each non-African population shared the
same common haplotype (20-15), no LD was detected
between the alleles in the three populations (Table 4). This is
likely due to the random effects of drift operating differently
in populations after the expansion out of Africa and
subsequent mutations away from the common haplotype.
Drift and mutation would affect each population differently.
This could explain why AR haplotype diversity is
greater for the European population than for populations of
Asian ancestry (Table 3). Our inability to detect LD may
also be due to the smaller sample sizes of the non-African
populations compared with the African populations. Sample size
likely played a role for the Chinese samples. For instance,
D' for many of the common haplotypes in China was
100% but not significant (Table 4).
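The point that D' can reach 100% without statistical significance in a small sample can be illustrated with a short Python sketch. The haplotype counts below are hypothetical, and the test shown is a plain two-sided Fisher's exact test on a 2x2 table of allele presence or absence; it is a simplification and not the Markov chain procedure implemented in Arlequin.

```python
from math import comb


def d_prime(n_AB, n_Ab, n_aB, n_ab):
    """Lewontin's D' for allele A (locus 1) and allele B (locus 2).

    Arguments are haplotype counts; the sign of D is retained.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    d = n_AB / n - p_A * p_B
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    return d / d_max if d_max > 0 else 0.0


def fisher_exact_p(n_AB, n_Ab, n_aB, n_ab):
    """Two-sided Fisher's exact P value for the same 2x2 table."""
    n = n_AB + n_Ab + n_aB + n_ab
    row_A = n_AB + n_Ab
    col_B = n_AB + n_aB

    def prob(x):
        # Hypergeometric probability of x chromosomes carrying both A and B.
        return comb(col_B, x) * comb(n - col_B, row_A - x) / comb(n, row_A)

    lo, hi = max(0, row_A + col_B - n), min(row_A, col_B)
    p_obs = prob(n_AB)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical counts for 30 chromosomes: allele A is seen only with allele B.
counts = (2, 0, 8, 20)
dp = d_prime(*counts)
p_value = fisher_exact_p(*counts)
print(f"D' = {dp:.2f}, P = {p_value:.3f}")
```

In this example the rare allele A occurs exclusively on B-bearing chromosomes, so D' is at its maximum, yet with only two such chromosomes the association does not reach P<0.05.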

Table 4 also reveals considerable sharing of haplotypes
in the two West African populations, indicative of shared
ancestry. For instance, haplotype 19-14 is common in
both Nigeria and Sierra Leone and in strong LD in both
populations. This is also reflected in the African American
population, where about 40% of the significantly
associated alleles were low in frequency (<0.05 allele
frequency) and appeared to be of African origin. Table 4
reveals five of these haplotypes (13-14, 15-12, 15-15, 16-12,
and 16-16). This is contrary to what would be expected if
the LD within the African American population were due
to admixture with Euroamericans.
Table 5 Genetic differentiation of populations

Type of comparison                 Variance (%)   Φ statistic    P
Among groups                       18.5           ΦCT = 0.185    0.01
Among populations within groups     1.2           ΦSC = 0.014    <0.001
Within populations                 80.3           ΦST = 0.197    <0.001

Table 6 Pairwise genetic distances based on ΦST (below the diagonal)
and their significance levels (above the diagonal)

                     African     European    Amer-                           Sierra
                     Americans   Americans   indians   Nigerians   Chinese   Leoneans
African Americans    -           0.000       0.000     0.019       0.000     0.029
European Americans   0.103       -           0.000     0.000       0.416     0.000
Amerindians          0.224       0.049       -         0.000       0.089     0.000
Nigerians            0.027       0.213       0.338     -           0.000     0.297
Chinese              0.137       -0.002      0.018     0.262       -         0.000
Sierra Leoneans      0.011       0.212       0.351     -0.001      0.265     -


Analysis  of  molecular  variance 

Genetic variance (Φ) statistics for the AR trinucleotide
repeat data are shown in Table 5. Using both molecular AR
haplotypic differences based on microsatellite repeat
length and haplotype frequencies, AMOVA revealed that
the AR trinucleotide repeat diversity is nonrandomly
distributed across populations. The amount of genetic
variance between the six populations was 19.7% (P<0.001).
The bulk of genetic variance for the AR gene (80.3%)
could be explained by individual differences within
populations. The ΦCT estimate was 0.185, revealing that
18.5% of the genetic variance was due to differences
between the African, Asian, and European descent groups
(Table 5).
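The relationship between the variance percentages and the Φ statistics in Table 5 can be checked with a brief Python sketch. It uses the standard hierarchical AMOVA definitions, with ΦCT, ΦSC, and ΦST expressed as ratios of variance components; the function name is hypothetical, and the inputs are the rounded percentages from Table 5, so small rounding differences are expected.

```python
def phi_statistics(var_among_groups, var_among_pops, var_within_pops):
    """Compute (phi_CT, phi_SC, phi_ST) from three variance components."""
    total = var_among_groups + var_among_pops + var_within_pops
    phi_ct = var_among_groups / total
    phi_sc = var_among_pops / (var_among_pops + var_within_pops)
    phi_st = (var_among_groups + var_among_pops) / total
    return phi_ct, phi_sc, phi_st


# Rounded variance percentages from Table 5.
phi_ct, phi_sc, phi_st = phi_statistics(18.5, 1.2, 80.3)
print(f"phi_CT = {phi_ct:.3f}, phi_SC = {phi_sc:.3f}, phi_ST = {phi_st:.3f}")

# The three statistics are linked by (1 - phi_ST) = (1 - phi_SC)(1 - phi_CT).
assert abs((1 - phi_st) - (1 - phi_sc) * (1 - phi_ct)) < 1e-12
```

The recomputed values agree with the reported ΦCT of 0.185 and ΦST of 0.197, and the ΦSC obtained from the rounded percentages is close to the reported 0.014.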

Pairwise genetic distances between the populations are
provided in Table 6. The lowest pairwise distances (<0.05)
between populations were observed among the closely
related African-descended populations (African Americans,
Nigerians, and Sierra Leoneans) and between the European
and Asian populations (Chinese and Amerindian). High
genetic distance values (>0.30) were observed between
divergent populations such as Amerindians and West
Africans from Nigeria or Sierra Leone (Table 6). A
moderate distance value of 0.10 between African Americans
and European Americans suggests a shared biohistory. All
but three of the population pairwise distances were
significant (P<0.05). The nonsignificant values reflect the close
genetic affinities of the three pairs of populations (Table 6).


Discussion 

In order to evaluate the extent of variation and linkage
disequilibrium between two trinucleotide repeat loci
within the AR gene we typed alleles from both markers in
six diverse human populations. Populations of African
descent exhibited the highest gene diversity among the
populations sampled. The African American population
contained more alleles and higher gene diversity than even
the indigenous West African populations from Sierra
Leone and Nigeria. Asians possessed the lowest gene
diversity of all populations and also contained the lowest
frequency of "high-risk" short alleles for prostate cancer.

Patterns of allelic variation differed substantially
between the six populations. Our data revealed that 80% of
men of African descent possessed CAG alleles shorter than
20 repeats while only 50% of non-African men had these
short alleles (see Fig. 1). The pattern was more
pronounced for the GGC locus, where 50% of African men
had GGC alleles shorter than 14 while no more than 13%
of men with European and Asian ancestry possessed the
short GGC alleles. Thus a greater proportion of the
haplotypes defined by short alleles at both loci (<20 CAG and
<14 GGC) appear to segregate in African populations
than in Asian and European populations. In fact, the
African American population possesses a mixture of short
allele haplotypes from different African populations. This
has never been explored and is quite significant since both
the CAG and GGC repeat loci influence the size of the
protein, which subsequently affects transactivation of the
receptor. These results parallel the prevalence of prostate
cancer in human populations. Populations in which shorter
CAG and GGC alleles are common, such as Africans, and
specifically African Americans, have the highest
incidence of prostate cancer in the world. At the other end of the
ethnic spectrum of prostate cancer incidence,
prevalence among Asians, who possess larger
trinucleotide alleles, may be up to 50-fold lower (Ross et al.
1996). Along with other genetic and environmental
factors, this could yield a stronger predisposition
to prostate cancer among the African American population.
Recently a relationship was reported between serum
PSA levels and polymorphisms in the PSA and AR genes
(Xue et al. 2001). Specifically, serum PSA levels
increased by 7% with each decrease in AR CAG repeat
allele size among individuals homozygous for a single
nucleotide polymorphism in the PSA gene promoter.

Our calculation of variance in the number of
trinucleotide repeats provided a reliable measure of diversity
since the two markers are microsatellites that conform to
a stepwise mutation model (DiRienzo et al. 1994; Valdes
et al. 1993). Larger variances and higher numbers of
alleles were observed for the CAG locus than the GGC locus
among all six populations. The higher diversity at the CAG
locus is likely due to a higher mutation rate at the CAG
locus than the GGC locus. CAG repeat variation has been
shown to cause several human diseases, such as Kennedy's
disease (MIM 313200), Huntington's disease (MIM
143100), and several forms of spinocerebellar ataxia:
SCA1 (MIM 164400), SCA2 (MIM 183090), SCA3
(MIM 109150), and SCA7 (MIM 164500). All of these
CAG repeat loci are polymorphic in normal individuals.
However, there appear to be constraints on allele size in
populations since disease results when the CAG repeat
lengths reach a certain threshold. A recent study of the
ERDA1 locus revealed that large CAG repeats are more
common among Asian populations, less common in
populations of European ancestry, and least common in
African populations (Deka et al. 1999). This pattern is
very similar to that observed in our study of
the AR trinucleotide repeats.

To explore population genetic affinities based on the
AR gene CAG and GGC repeats we performed an
AMOVA. Almost 20% of AR gene variance is attributed
to differences between populations. Pairwise genetic
distance values were significant for all population pairs
except those with shared ancestry, such as the
Asian and Amerindian populations and the Nigerian and
Sierra Leone populations. Many of the genetic differences
between populations may be due to genetic drift. Since
the AR gene is X-linked, it is more vulnerable to the
effects of drift than similar markers on autosomes.
This is due to a lower recombination rate and a smaller
effective population size for X-linked markers (approximately
three-fourths that of autosomal markers).
Although the estimate for AR genetic differentiation (ΦST)
between populations is higher than for autosomal
markers, it is not as high as estimates for the haploid
systems of mtDNA and the Y chromosome (see Jorde et
al. 2000). This is because the effective population size for
mtDNA and the Y chromosome is about one-third of that
for the X chromosome.
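The relative effective population sizes mentioned above can be made explicit with a short Python sketch based on the standard formulas for two sexes. The function name and the numbers of breeding males and females are hypothetical; equal numbers of each are assumed, which is the idealized case behind the three-fourths figure.

```python
from fractions import Fraction


def effective_sizes(n_males, n_females):
    """Effective population size by mode of inheritance (standard formulas)."""
    nm, nf = Fraction(n_males), Fraction(n_females)
    return {
        "autosome": 4 * nm * nf / (nm + nf),
        "X": 9 * nm * nf / (4 * nm + 2 * nf),
        "Y": nm / 2,
        "mtDNA": nf / 2,
    }


# Hypothetical population with equal numbers of breeding males and females.
ne = effective_sizes(500, 500)
x_to_autosome = ne["X"] / ne["autosome"]
y_to_x = ne["Y"] / ne["X"]
print(f"X / autosome = {x_to_autosome}, Y / X = {y_to_x}")
```

Under these assumptions the X chromosome has three-fourths of the autosomal effective size, and the Y chromosome and mtDNA each have one-third of the X-linked value, so drift is expected to act most strongly on the haploid systems.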

Studies that have examined the association of alleles at
the AR CAG and GGC repeat loci in relation to the
development of prostate cancer have been ambiguous. Irvine
et al. (1995) examined AR trinucleotide repeat variation
in prostate cancer cases and controls from three ethnic
groups: Euroamericans, African Americans, and Asians.
LD was observed within the mixed group of cases but not
within any one ethnic group. Several factors may have led
to this observation. The prostate cancer cases consisted of
three diverse populations, and therefore it is highly likely
that stratification existed when they were pooled together.
Also, since the sample sizes of the three groups were low
(<50) it is likely that there was not sufficient statistical
power to detect LD between the two markers. Later
Stanford and colleagues (1997) examined a relatively large
sample of Euroamericans. Their study of 301 cases
and 277 controls failed to detect any LD between the
markers within the two groups. A larger study consisting
of 582 cases and 794 controls (Platz et al. 1998) revealed
significant LD in both the cases and controls. The Platz et
al. (1998) finding, using mainly Euroamerican men, was
significant only after they pooled alleles for the two
markers into categories of fewer than 23, 23, and more than
23 repeats.

It is a general rule that strong disequilibrium indicates
that two marker loci are closely spaced. However, it is not
always true that two closely spaced markers show
disequilibrium. The frequencies of marker alleles and sample
size affect the power to detect LD. Also, recombination
and/or mutation hotspots could affect LD by increasing
the chance that the associated marker allele will change.
Not only is haplotype variation shaped by accumulated
mutation within haplotypic lineages, it is also fashioned
by recombination events among the lineages. It is unlikely
that the decay of LD and pattern of variability observed
for the AR CAG and GGC defined haplotypes is due to a
recombination hotspot between the markers since they are
separated by only 1 kb, and recombination on the X
chromosome occurs only in women. However, recombination
cannot be ruled out, especially since recombination
hotspots are more likely in GC-rich regions
of the genome (Eisenbarth et al. 2000). Population history
can also have an effect on the extent of LD. Our
observation of no detectable LD among the non-African
populations may be explained by recent population growth and
the high mutation rate at the CAG repeat locus. This is in
contrast to the population bottleneck explanation for
higher LD levels outside of Africa (Kidd et al. 1998;
Tishkoff et al. 1996, 1998). Our finding of significant LD
in the African American population is due mainly to gene
flow from other populations. Admixture between
populations with divergent allele frequencies can generate LD
extending beyond 30 cM (Lautenberger et al. 2000).
Finally, genetic drift can greatly affect or reinforce existing
associations. The role of genetic drift in increasing or
decreasing LD may be more significant among the
Amerindians since fewer alleles were observed among
the Amerindians than among the other
populations. This is consistent with observations of low
genetic diversity among Amerindians due to their history
of recent population bottlenecks (Kittles et al. 1999; Nei
and Roychoudhury 1993; Urbanek et al. 1996).

As previously stated, the high level of linkage
disequilibrium observed among African Americans is likely due
to multiple sources of admixture. Of the significant allelic
associations between the trinucleotide markers in the
African American population from South Carolina, 26%
appear to have originated from European Americans,
while 39% were shared among West African populations
from Nigeria and Sierra Leone. These results suggest that
the LD generated in African Americans from Columbia,
South Carolina, may be due to recent migration of African
Americans from diverse rural communities following
urbanization, recurrent gene flow from distinct West African
populations, and admixture with European Americans.
Columbia is the capital of South Carolina and is located in
the center of the state. Many African Americans migrated
to this region from the coastal Sea Islands and the Low
Country (Berkeley, Charleston, Colleton, and Dorchester
counties) during the early 1900s. In the late 1700s the
percentage of persons of African origin was quite high in the
coastal areas, including the port of Charleston (ranging
from 47% to 93%). It also appears that colonial South
Carolinians preferred certain African ethnic groups over
others as slaves (Littlefield 1981; Morgan 1998). For
instance, for a period of time in South Carolina enslaved
Africans from Senegambia were preferred over others
(Littlefield 1981). This preference was based mainly on
the Senegambians' familiarity with rice production, which
was the chief crop cultivated in the Carolinas at the time.
However, these preferences changed over time along with
the changing slave economy in the colonies. The
changing trends, along with the relative isolation of the coastal
communities of South Carolina, likely led to diverse South
Carolina African American populations. Subsequently,
divergent haplotypes were brought together as people left
the rural communities for more urban areas such as
Columbia.

This assessment of linkage disequilibrium in the
African American population is quite significant for
several reasons. First, the high level of stratification in the
African American population may be a confounder in
disease association studies if the substructure is not
controlled for. Second, the identification of high-risk
haplotypes is potentially more powerful in disease studies than
single locus analyses. We intend to increase the resolution
in identification of these possible high-risk haplotypes for
prostate cancer by typing single nucleotide
polymorphisms within the gene and performing haplotype
analyses. Ultimately these studies will provide a better
understanding of the role variation within the AR plays in
prostate cancer etiology.
Abstract The gene for the β-chain of the high-affinity
receptor for IgE (FcεRIβ) has been proposed as a
candidate gene for atopy. A coding variant Glu237Gly has been
studied in various populations with asthma and atopy, and
the results were controversial for association of the
variant with atopy/asthma. Because nasal allergy is a more
common atopic disease and shows less remission than
asthma, we analyzed whether the Glu237Gly variant is
correlated with nasal allergy. The study enrolled 233
patients with nasal allergy and 100 control subjects. Further,
three subgroups were selected: patients with perennial
nasal allergy (n=149), Japanese cedar pollinosis (n=189),
and allergy to multiple allergens (n=45). The allele
frequency of Gly237 in the controls and patients was 0.14
and 0.20, and the frequency of Gly237-positive subjects
was 0.23 and 0.356, respectively. There was a significant
association between Gly237-positivity and nasal allergy,
perennial nasal allergy, Japanese cedar pollinosis, and
allergy to multiple allergens. Among all 333 subjects we
observed a significant relationship between Gly237 and
elevated levels of serum total IgE (>250 IU/ml) and very
high IgE (>1000 IU/ml). Among patients positive for a
specific IgE, Gly237 was significantly associated with
high IgE for house dust, mite, and Japanese cedar pollen.
These results suggest that the Glu237Gly variant of the
FcεRIβ gene is involved in the development of nasal
allergy through the process for the production of both
specific and nonspecific IgE antibodies.
Introduction

Nasal allergy is a common atopic disease, the prevalence
of which is around 20% or more in Western countries (Sly
1999; Strachan et al. 1997). Various factors such as
duration of breast feeding, maternal age, social class, heating
with wood or coal, and exposure to diesel exhaust fumes
are believed to affect the prevalence of nasal allergy
(Butland et al. 1997; Duhme et al. 1998). However, it is
generally accepted that the best established risk factor for
nasal allergy is a family history of allergy, especially nasal
allergy (Bahna 1992; Sibbald and Rink 1991; Wright et al.
1994), indicating that genetic factors strongly influence
nasal allergy.

Many studies have been performed on the genetics of
atopy since Cookson et al. (1989) first described a linkage
between serum IgE level and a DNA marker for
chromosome 11q in British families. Regarding chromosome
11q, some studies have confirmed the linkage of atopy and
bronchial hyperresponsiveness to markers on 11q13 (Adra
et al. 1999; Collee et al. 1993; Daniels et al. 1996; van
Herwerden et al. 1995; Mao et al. 1997; Shirakawa et al.
1994a; Young et al. 1992), while others have failed to find
the linkage (Amelung et al. 1992; Collaborative Study on
the Genetics of Asthma 1997; Hizawa et al. 1992; Lympany
et al. 1992; Malerba et al. 1999; Ober et al. 1998; Rich et
al. 1992; Wjst et al. 1999; Yokouchi et al. 2000).
Meanwhile, the gene for the β-chain of the high-affinity
receptor for IgE (FcεRIβ) has been identified as a candidate
gene for this linkage between atopy and 11q13 (Sandford
et al. 1993), and two coding variants in exon 6 of FcεRIβ,
Ile181Leu/Ile183Val and Ile181Leu, are reported to be
associated with atopy in British subjects (Shirakawa et al.
1994b). These variants subsequently proved to be rare in
other races, and another coding variant Glu237Gly in
exon 7 was identified as a more common coding variant
(Hill and Cookson 1996). Some reports show an
association of this variant with atopy and/or bronchial
hyperresponsiveness while others do not. Thus, it appears to await
clarification whether this coding variant of FcεRIβ influ-
Abstract

Androgens play an important role in the etiology of
prostate cancer. The CYP17 gene encodes the
cytochrome P450c17α enzyme, which is the
rate-limiting enzyme in androgen biosynthesis. A T to C
polymorphism in the 5' promoter region has recently
been associated with prostate cancer. However,
contradictory data exist concerning the risk allele. To
investigate further the involvement of the CYP17
variant with prostate cancer, we typed the
polymorphism in three different populations and
evaluated its association with prostate cancer and
clinical presentation in African Americans. We
genotyped the CYP17 polymorphism in Nigerian (n =
56), European-American (n = 74), and
African-American (n = 111) healthy male volunteers, along
with African-American men affected with prostate
cancer (n = 71), using pyrosequencing. Genotype and
allele frequencies did not differ significantly across the
different control populations. African-American men
with the CC CYP17 genotype had an increased risk of
prostate cancer (odds ratio, 2.8; 95% confidence
interval, 1.0-7.4) compared with those with the TT
genotype. A similar trend was observed between the
homozygous variant genotype in African-American
prostate cancer patients and clinical presentation. The
CC genotype was significantly associated with higher
grade and stage of prostate cancer (odds ratio, 7.1;
95% confidence interval, 1.4-36.1). The risk did not
differ significantly by family history or age. Our
results suggest that the C allele of the CYP17
polymorphism is significantly associated with increased
prostate cancer risk and clinically advanced disease in
African Americans.

Introduction 

The incidence of prostate cancer varies significantly across
ethnic groups, with African-American men having the highest
rates worldwide (1-3). African Americans also appear to
present more commonly at an advanced stage with aggressive
histology and increased cancer-related mortality (4).
Although the more advanced cancers in African Americans
may be confounded by social class and access to health care,
there is a critical need to explore the etiological pathways
(genetic and environmental factors) that contribute to this
disparity.

Because the prostate is an androgen-regulated organ, androgens
may play a major role in the etiology of prostate
cancer. The CYP17 gene encodes the cytochrome P450c17α
enzyme that catalyzes two key steps in the steroid biosynthesis
pathway. The first step in the biosynthesis pathway involves the
conversion of cholesterol to pregnenolone by CYP11A1. Subsequently,
pregnenolone is converted to 17α-hydroxypregnenolone
and then to dehydroepiandrosterone, a precursor of
testosterone, by the P450c17α enzyme. A T to C polymorphism
in the 5' promoter region of the CYP17 gene has been described
(5) and has been associated with increased risk for early-onset
familial breast cancer (6-8). Also denoted as the A2
allele, this single nucleotide polymorphism may create an Sp1-type
promoter site. However, recent electromobility shift assays
have not confirmed Sp1 binding (9).

The CYP17 gene is a likely candidate for prostate cancer,
which, like breast cancer, is hormone-related. To date, four
studies have shown an association between the CYP17 gene and
prostate cancer risk; however, they have been contradictory in
terms of which allele is associated. Two studies, from Sweden
and Japan, suggested that the T (A1) allele was associated with
increased risk for prostate cancer (10, 11), whereas independent
studies from the United States and Austria reported that the C
(A2) allele confers greater risk (12, 13). It is interesting to note
that the C (A2) allele is more prevalent among Asian populations
(8, 11, 12, 14). However, CYP17 allele and genotype
frequencies do not seem to differ between African Americans
and European Americans (8, 12, 15), unlike several other candidate
genes for prostate cancer, which exhibit striking allele
frequency differences that parallel differences in prostate cancer
incidence (16-18). To date, no allele and genotype frequency
data exist on clinically evaluated indigenous Africans
and African-American prostate cancer patients. The purpose of
this study was to determine whether differences exist in CYP17
genotype frequencies between African, African-American, and
European-American populations and whether the CYP17 polymorphism
was associated with prostate cancer risk in African
Americans.
Subjects and Methods

Unrelated men were enrolled from three sites for a population-based
study of genetic risk factors for prostate cancer.
The Howard University Institutional Review Board approved
the study, and written consent was obtained from all
subjects. All prostate cancer cases were between 40 and 79
years of age and were diagnosed with prostate cancer within
the last 2 years. One hundred and eighty-two African Americans
(71 prostate cancer patients and 111 healthy male
controls) were enrolled from the Washington, DC area.
African-American men with prostate cancer were recruited
from the Division of Urology at the Howard University
Hospital and/or prostate cancer screening at the Howard
University Cancer Center. The response rate among the
African-American cases was 92%. Healthy African-American
male volunteers were enrolled among individuals undergoing
regular physical exams at the Division of Urology at
Howard University Hospital and/or men participating in
screening programs for prostate cancer at the Howard University
Cancer Center. The response rate for the African-American
controls was 90%. The mean age of prostate
cancer patients was 66.3 ± 3.3 years and, among controls,
57.3 ± 0.8 years. Clinical characteristics including Gleason
grade, PSA, Tumor-Node-Metastasis stage, age at diagnosis,
and family history were obtained from medical records.

Seventy-four European-American healthy male volunteers
(mean age, 58.5 ± 2.9 years) were enrolled through
various prostate cancer-screening programs in Baltimore,
MD, sponsored by the Johns Hopkins Cancer Center. Fifty-six
healthy volunteers (mean age, 51.9 ± 1.6 years) belonging
to the Edo ethnic group were enrolled in Benin City,
Nigeria. Nigerian males were enrolled through a community-based
study of risk factors for prostate cancer during the
summer of 2000 in collaboration with the University of
Benin Teaching Hospital in Benin City, Nigeria. The response
rate among the Nigerian controls was 85%. Blood
samples were collected from each subject. Ethnicity for all
groups was self-reported, and individuals of mixed ancestry
were not excluded. All healthy volunteers had PSA levels
<4.0 ng/ml and normal digital rectal exams.

Genotyping. The genomic DNA was obtained from isolated
lymphocytes using cell lysis, proteinase K treatment, protein
precipitation, and DNA precipitation. Genotyping of the T to C
polymorphism in the promoter region of the CYP17 gene was
performed using Pyrosequencing (19, 20). The primers for the
polymorphism were designed from the published promoter
sequence (National Center for Biotechnology Information accession
no. M63871). A 167-bp fragment was amplified in a
50-μl PCR reaction containing 30 ng of genomic DNA, 20
pmol of forward unlabeled 5'-TTC CAC AAG GCA AGA
GAT AAC-3' and a reverse biotin-labeled primer (b indicates
biotin) 5'-b-GGT AAG CAG CAA GAG AGC CA-3', 1×
PCR buffer II (Perkin-Elmer), 2 mM MgCl2, 0.2 mM dNTP, and
AmpliTaq Gold DNA polymerase. PCR reactions were performed
for 50 cycles: denaturation at 95°C for 30 s, annealing
at 54°C for 20 s, and extension at 72°C for 30 s.

Biotinylated single-stranded DNA fragments were generated
by mixing the PCR product with streptavidin-coated
paramagnetic beads (Dynabeads M280; Dynal, Norway).
The PCR products and Dynal beads were mixed with high-salt
buffer [0.1% Tween 20, 2 M NaCl, 0.5 mM EDTA, and
10 mM Tris-HCl (pH 7.6)], incubated for 15 min at 65°C, and
spun at 1400 rpm in a thermomixer (Eppendorf). Then the
material was resuspended in 0.5 M NaOH and incubated for
2 min to separate DNA strands. Dynal beads containing the
immobilized strand were washed in 1× annealing buffer (20
mM Tris-acetate and 5 mM MgAc2) and resuspended in 45
μl of 1× annealing buffer and 10 pmol of sequencing
primer. Then the mixture was incubated at for 2 min
and then cooled to room temperature. Throughout the sample
preparation steps, the immobilized fragments coupled to
Dynal beads were processed using a manifold device (PSQ
96 Sample Preparation Tool; Pyrosequencing AB, Uppsala,
Sweden). An automated pyrosequencing instrument, PSQ96
(Pyrosequencing AB), was used to perform genotyping. The
reaction was carried out at 25°C with the sequencing primer
5'-GGC AGG CAA GAT AGA CA-3' added to the reaction.
The reaction mixture also contained DNA polymerase
(exonuclease-deficient), 40 mU apyrase, 4 μg of purified
luciferase/ml, 15 mU of recombinant ATP sulfurylase, 0.1 M
Tris-acetate (pH 7.75), 0.5 mM EDTA, 5 mM magnesium
acetate, 0.1% BSA, 1 mM DTT, 10 μM adenosine 5'-phosphosulfate,
0.4 mg of poly(vinylpyrrolidone)/ml, and 100 μg
of D-luciferin/ml. The mini-sequencing protocol was carried
out by stepwise elongation of the primer strand upon sequential
addition of 40 pmol of the different deoxynucleoside
triphosphates and the simultaneous degradation of nucleotides
by apyrase. As the sequencing reaction continued,
the complementary DNA strand extended and the DNA sequence was
determined from the single peaks in the pyrogram using
Pyrosequencing software (Pyrosequencing AB). All samples
were genotyped twice directly from genomic DNA.
Control DNAs included a known wild-type (TT), a heterozygous
mutant (CT), and a homozygous mutant (CC) variant
sample. The control DNAs were confirmed by direct DNA
sequencing using an ABI 377 DNA sequencer (ABI, Foster
City, CA). Genotypes from the repeat assay were 100%
concordant with initial genotypes.

Statistical Analysis. Genotype and allele frequencies were
calculated for each population. Hardy-Weinberg equilibrium
in each group was evaluated by contingency table
analysis. The SAS Version 6.12 computer program (SAS
Institute, Inc., Cary, NC) was used to compute the two-sided
Pearson χ² test. ORs and Ps were determined from a comparison
of genotypes in Nigerians and European Americans
versus African-American healthy controls. Genotypes were
also compared between African-American prostate cancer
patients and healthy controls. Regression analyses were used
to assess whether age at diagnosis and family history modified
the relationship between CYP17 and prostate cancer
risk. Regression analyses were also performed to compare
grade/stage among prostate cancer patients. Grade/stage was
defined as low (T1a-T1c and/or Gleason grade <7) or high
[T2-T4 or N (+) or M (+) stage and/or Gleason grade ≥7;
see Refs. 16 and 17]. For the analyses of prostate cancer
patients, the regression model controlled for age at diagnosis,
PSA (total), and family history (affected first-degree
relative).
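
The population comparison described above can be illustrated with a minimal sketch. It is not the SAS procedure used in the study; it assumes a plain Pearson χ² test on a 2 × 3 table of genotype counts (TT, TC, CC) taken from Table 1, and the function names `pearson_chi2` and `chi2_p_value` are hypothetical helpers introduced only for this illustration.

```python
import math

def pearson_chi2(table):
    """Return (chi2, df) for an r x c contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df

def chi2_p_value(chi2, df):
    """Upper-tail P using closed forms that hold only for df = 1 or df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2.0))
    if df == 2:
        return math.exp(-chi2 / 2.0)
    raise ValueError("this sketch supports only df = 1 or df = 2")

# Genotype counts (TT, TC, CC) in healthy controls, from Table 1.
nigerian = [24, 27, 5]
european_american = [28, 38, 8]
african_american = [55, 46, 10]

p_nig = chi2_p_value(*pearson_chi2([nigerian, african_american]))
p_eur = chi2_p_value(*pearson_chi2([european_american, african_american]))
print(f"Nigerians vs African Americans:          P = {p_nig:.2f}")
print(f"European Americans vs African Americans: P = {p_eur:.2f}")
```

Under these assumptions the two P values round to the 0.69 and 0.29 reported in Table 1, which suggests the published comparison was a genotype-level test with 2 degrees of freedom.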

Results 

Fig. 1 shows examples of pyrograms representing the CYP17
genotypes. The C (A2) allele frequency was 30% among the
African-American controls. CYP17 genotype frequencies in
the three normal control populations are shown in Table 1.

Fig. 1. Pyrograms of the CYP17 genotypes. The DNA sequence is
CGT/CCACCT. A, homozygous TT sample; B, heterozygous TC sample;
C, homozygous CC sample. E and S at the beginning of each pyrogram
denote the addition of the enzyme and substrate, respectively.
The first T and second G in the pyrograms were negative controls
that were used as internal controls for the pyrosequencing reactions.


Table 1  CYP17 genotype frequencies in healthy controls from
three populations

                              Genotype n (%)
Population            n     TT         TC         CC        P^a
Nigerians             56    24 (43%)   27 (48%)   5 (9%)    0.69
European Americans    74    28 (38%)   38 (51%)   8 (11%)   0.29
African Americans     111   55 (50%)   46 (41%)   10 (9%)

^a Two-sided P from Pearson χ² tests comparing genotype frequencies in each
population with African Americans.


Table 2  CYP17 genotype frequencies in African-American prostate cancer
cases and healthy controls

Genotype   Cases (n = 71)   Controls (n = 111)   OR           95% CI    P^a
TT         22 (31%)         55 (50%)             1.0 (Ref.)
TC         38 (54%)         46 (41%)             2.0          1.0-3.9   0.03
CC         11 (15%)         10 (9%)              2.8          1.0-7.4   0.04
TC + CC    49 (69%)         56 (50%)             2.2          1.2-4.1   0.01

^a Two-sided P from Pearson χ² tests.


Table 3  Comparison of CYP17 genotype with grade/stage^a among
African-American prostate cancer patients

Genotype   Low (n = 37)   High (n = 34)   OR           95% CI     P^b
TT         16 (43%)       6 (18%)         1.0 (Ref.)
TC         18 (49%)       20 (58%)        2.9          1.0-9.2    0.05
CC         3 (8%)         8 (24%)         7.1          1.4-36.1   0.01
TC + CC    21 (57%)       28 (82%)        3.6          1.2-10.6   0.01

^a Grade/stage was defined as low (T1a-T1c stage and/or Gleason grade <7) or high
(T2-T4 or N (+) or M (+) stage and Gleason grade ≥7).

^b P from logistic regression analyses controlling for age, PSA, and family history.


Stratification of the 71 African-American prostate cancer
cases by grade/stage is shown in Table 3. Heterozygotes
for the CYP17 polymorphism made up 58% (20 of 34) of men
presenting with high grade/stage prostate cancer compared with
49% (18 of 37) of men with low grade/stage. The CC genotype was
observed in 24% (8 of 34) of men with high grade/stage compared
with only 8% (3 of 37) of men with low grade/stage. The OR
comparing the TC genotype with TT between low and high
grade/stage disease suggests an increased risk of presenting with high
grade/stage (OR, 2.9; 95% CI, 1.0-9.2). A stronger association
was observed when comparing the CC genotype with TT (OR, 7.1;
95% CI, 1.4-36.1; P = 0.01). Because of the small number of
samples in certain categories, the 95% CIs for the ORs are
wide.


Genotypes in each population were in Hardy-Weinberg equilibrium
(P > 0.05; data not shown). Genotype frequencies for
European Americans and African Americans were consistent
with previously published frequencies (8, 12, 15). Pearson χ²
tests revealed no significant differences in genotype frequencies
when African Americans were compared with Nigerians or
European Americans (Table 1).
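
As an illustration of the Hardy-Weinberg check, the following sketch applies a one-degree-of-freedom χ² goodness-of-fit test to the control genotype counts in Table 1. The exact contingency-table procedure used in the study is not described, so this particular test, and the helper name `hwe_chi2`, are assumptions made for the example.

```python
import math

def hwe_chi2(n_tt, n_tc, n_cc):
    """Chi-square goodness-of-fit test for Hardy-Weinberg proportions.

    Returns (C allele frequency, chi2, P), with P taken from the
    chi-square distribution with 1 degree of freedom.
    """
    n = n_tt + n_tc + n_cc
    p = (2 * n_tt + n_tc) / (2 * n)  # T allele frequency
    q = 1.0 - p                      # C allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_tt, n_tc, n_cc)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail, 1 df
    return q, chi2, p_value

# Genotype counts (TT, TC, CC) in healthy controls, from Table 1.
controls = {
    "Nigerians": (24, 27, 5),
    "European Americans": (28, 38, 8),
    "African Americans": (55, 46, 10),
}

for name, counts in controls.items():
    q, chi2, p_value = hwe_chi2(*counts)
    print(f"{name}: C allele frequency = {q:.2f}, "
          f"chi2 = {chi2:.2f}, P = {p_value:.2f}")
```

With these counts the C allele frequency in African-American controls comes out at about 0.30, matching the 30% reported in the text, and none of the three groups departs from Hardy-Weinberg proportions at the 0.05 level.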

The proportion of men carrying at least one copy of the C (A2)
allele was significantly higher among African-American prostate
cancer cases (69%) than among controls (50%; P = 0.01; Table 2).
An increased risk for prostate cancer was observed for individuals
with at least one copy of the C allele (OR, 2.2; 95% CI,
1.2-4.1). The risk for prostate cancer among heterozygous
individuals was intermediate between that of TT homozygotes
and that of individuals homozygous for the C allele (ORs of 2.0
and 2.8, respectively; Table 2). This
suggests a gene-dosage effect in which the risk for prostate cancer
increases with the number of C allele copies. Additional analyses
were performed to examine whether a relationship exists between
CYP17 genotype and age at diagnosis (<66 years of age
versus >66 years of age) and family history of prostate cancer.
No relationship was observed between the CYP17 polymorphism
and age of onset in African Americans (P = 0.71).
Similarly, no association was observed with family history
(P = 0.65; data not shown).
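
The odds ratios in Table 2 can be recomputed from the genotype counts. The sketch below assumes a crude (unadjusted) odds ratio with a Woolf (log-based) 95% confidence interval; the study does not state which interval method was used, so this choice, and the helper name `odds_ratio_ci`, are assumptions for illustration only.

```python
import math

def odds_ratio_ci(case_exposed, control_exposed, case_ref, control_ref, z=1.96):
    """Crude odds ratio with a Woolf (log-scale) confidence interval."""
    odds_ratio = (case_exposed * control_ref) / (case_ref * control_exposed)
    se_log_or = math.sqrt(
        1 / case_exposed + 1 / control_exposed + 1 / case_ref + 1 / control_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Counts from Table 2; the reference genotype TT has 22 cases and 55 controls.
comparisons = {
    "TC vs TT": (38, 46),
    "CC vs TT": (11, 10),
    "TC + CC vs TT": (49, 56),
}

for label, (cases, controls) in comparisons.items():
    odds_ratio, lower, upper = odds_ratio_ci(cases, controls, 22, 55)
    print(f"{label}: OR = {odds_ratio:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

The values obtained this way are close to the published estimates; small differences in the last digit are to be expected if the original analysis used a different rounding rule or interval method.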


Discussion 

Prostate cancer development is influenced by androgens,
which are regulated by genetic and environmental factors.
Environmental factors such as dietary fat intake play a role
in the development of prostate cancer (21). CYP17 is an ideal
candidate for prostate cancer because it is directly involved
in the production of testosterone. In this study, we examined
the role a CYP17 promoter polymorphism plays in prostate
cancer among African Americans. The CYP17 polymorphism
is in the promoter region and may create an additional
Sp1-type site (CCACC) 34 bases upstream of the initiation
of translation and downstream from the transcription start
site. The presence of this variant may result in increased
production of testosterone attributable to an increased rate of
transcription (5). This would increase the bioavailability of
testosterone for conversion to dihydrotestosterone, ultimately
affecting prostate cell growth. Kristensen et al. (9)
demonstrated that the CYP17 promoter polymorphism does
not create an Sp1 binding site, but suggested that other
transcription factors might interact with this polymorphism.
However, it is possible that in vivo conditions may favor Sp1
binding to the variant Sp1 site in the prostate, thus bringing
about increased transcription of the CYP17 gene.

Two important risk factors for prostate cancer are age and
ethnicity. The CYP17 polymorphism was significantly associated
with disease and aggressiveness, and its effect did not
seem to be modified by age at diagnosis or family history. The
CYP17 association with prostate cancer among African Americans
may also explain the higher circulating testosterone concentration
in African-American men when compared with other
ethnic groups (18, 22), because the gene is directly involved in
testosterone biosynthesis. It is critical to determine whether
differences exist in allele and genotype frequency between
populations, because this may help explain some of the differences
between populations in prostate cancer prevalence. In this
study we showed that the frequency of the CYP17 variant was
consistent across control populations consisting of Nigerians,
African Americans, and European Americans. This is an important
observation, because it suggests that CYP17 may not
account for all of the differences in testosterone levels and
prostate cancer incidence between populations. Also, allele
frequency differences between populations can be a confounder
in association studies if not controlled for (23-25), especially in
genetic studies on the African-American population, which is
highly heterogeneous because of its African ancestry and recent
admixture with European Americans.

This is the first study that investigated the relationship
between the CYP17 polymorphism and prostate cancer in
African Americans. Although the observed association of the
CYP17 C (A2) allele with prostate cancer is consistent with
previous studies on Austrians (13) and European Americans
from South Carolina (12), no association has previously been
shown with clinical presentation of the disease. Also, conflicting
results have been published as to the associated
allele. The T (A1) allele was associated with increased prostate
cancer risk in the Japanese (11) and the Swedish populations
(10). It has been suggested that the CYP17 genotypes
may play either a protective or a promoting role in
prostate cancer progression, given different environmental
and/or genetic backgrounds (11). Different populations exhibit
different environmental factors (diet, lifestyle, etc.),
levels of genetic variation, and patterns of genotype/environment
interactions. All of these factors play a role in
prostate cancer progression. This may be one of several
reasons for the contradictory results. Another reason could
be that the T to C promoter polymorphism within the CYP17
gene is in moderate (or incomplete) linkage disequilibrium
with the actual disease-related polymorphism. It is likely that
the disease allele is older in age than the promoter polymorphism
because both promoter alleles (T and C) have been
found to be associated with the disease in vastly different
populations. Events such as recombination could place the
disease allele on different CYP17 haplotypic backgrounds,
and so single marker studies would produce conflicting
results. This could be evaluated by screening the CYP17
gene for more polymorphisms, estimating the level of linkage
disequilibrium, and performing haplotypic (multisite)
association analyses on prostate cancer in different populations.

In summary, a common CYP17 variant was associated
with increased risk of prostate cancer in African-American
men. Comparison of genotypes revealed a significantly higher
risk among individuals homozygous for the C allele for developing
high grade/stage prostate cancer. In fact, African-American
patients with the CC genotype were seven times more
likely to present with more aggressive disease. Because the
sample sizes were moderate for the African-American samples,
the results should be interpreted with caution until larger studies
further evaluate the polymorphism. Future research on the role
polymorphisms within the CYP17 gene play in prostate cancer
and clinical presentation may demonstrate a need for genetic
screening, possibly providing better treatment opportunities or
prevention strategies. However, other genes have been identified
that also are involved in prostate cancer, and CYP17 may
play a small but important role in the etiology of prostate
cancer.

Abstract CYP3A4-V, an A to G promoter variant associated
with prostate cancer in African Americans, exhibits
large differences in allele frequency between populations.
Given that the African American population is genetically
heterogeneous because of its African ancestry and subsequent
admixture with European Americans, case-control
studies with African Americans are highly susceptible to
spurious associations. To test for association with prostate
cancer, we genotyped CYP3A4-V in 1376 (2N) chromosomes
from prostate cancer patients and age- and ethnicity-matched
controls representing African Americans,
Nigerians, and European Americans. To detect population
stratification among the African American samples, 10 unlinked
genetic markers were genotyped. To correct for the
stratification, the uncorrected association statistic was divided
by the average of association statistics across the
10 unlinked markers. Sharp differences in CYP3A4-V frequencies
were observed between Nigerian and European
American controls (0.87 and 0.10, respectively; P<0.0001);
African Americans were intermediate (0.66). An association
uncorrected for stratification was observed between
CYP3A4-V and prostate cancer in African Americans
(P=0.007). A nominal association was also observed among
European Americans (P=0.02) but not Nigerians. In addition,
the unlinked genetic marker test provided strong evidence
of population stratification among African Americans.
Because of the high level of stratification, the corrected
P-value was not significant (P=0.25). Follow-up
studies on a larger dataset will be needed to confirm
whether the association is indeed spurious; however, these
results reveal the potential for confounding in association
studies of African Americans and the need for
study designs that take into account substructure caused
by differences in ancestral proportions between cases and
controls.


Introduction 

Given the role that androgens play in prostate development,
genes involved in androgen biosynthesis and metabolism
may be important factors involved in the etiology
of prostate cancer. One such gene may be the
CYP3A4 gene, a member of the cytochrome P450 supergene
family involved in the oxidative deactivation of
testosterone (Waxman et al. 1998). Recently, CYP3A4-V,
an A to G polymorphism in the nifedipine-specific element
(NFSE) of the 5' regulatory region of the gene, has
been associated with higher Gleason grade and TNM
stage (pathologic system of tumor classification) prostate
cancer (Rebbeck et al. 1998; Paris et al. 1999). The associations
were most pronounced among men older than
65 years of age with no family history (Rebbeck et al.
1998; Paris et al. 1999).

It has been suggested that CYP3A4-V decreases CYP3A4
protein activity, thus increasing the availability of testosterone
(Paris et al. 1999; Rebbeck 2000). Although there
is no consensus on a direct functional correlation of the
CYP3A4 polymorphism (Westlind et al. 1999; Amirimani
et al. 1999; Ando et al. 1999; Ball et al. 1999), there does
appear to be a strong link between high levels of testosterone
and prostate cancer development (Ross et al. 1995,
1998). Interestingly, allele frequencies for CYP3A4-V
vary significantly between populations within the US
(Rebbeck et al. 1998; Paris et al. 1999; Walker et al. 1998)
with a trend similar to prostate cancer incidence. The incidence
of prostate cancer is roughly 60% higher among
African American men than European American men
(Ries et al. 2000). Similarly, CYP3A4-V allele frequency
is much higher among African Americans than among
European Americans and Asians (Rebbeck et al. 1998;
Paris et al. 1999; Walker et al. 1998). These observations
are important because the African American population is
highly heterogeneous because of African ancestry and subsequent
admixture with European Americans. This unique
population history could have major consequences for association
studies when the risk for disease varies considerably
between African American and European American
populations. If the frequency of a genetic polymorphism
also varies between the ethnic groups, then the
polymorphism will appear to be related to the disease
(Chakraborty and Weiss 1988; Lander and Schork 1994;
Wacholder et al. 2000). This confounding resulting from
population stratification is of special concern to genetic
epidemiologists, especially with the increased attention on
single nucleotide polymorphisms (SNPs) to facilitate population-based
methods for genetic studies of complex
disease (Collins et al. 1997; Kruglyak 1999). Unfortunately,
the underlying population structure may not be known, and
crude proxies such as "race" may not sufficiently resolve
the level of stratification that may exist within the population.
Recently, it has been proposed that unlinked genetic
markers be typed in order to detect, quantify, and correct for
stratification in case-control studies (Pritchard and Rosenberg
1999; Pritchard et al. 2000; Devlin and Roeder 1999;
Schork et al. 2001; Reich and Goldstein 2001). The rationale
behind the unlinked marker analysis is straightforward.
If stratification exists, not only would the candidate marker
be associated with the disease but so would unlinked markers.

Genetic association studies on prostate cancer in African
Americans are particularly vulnerable to confounding attributable
to population stratification because one of the
major risk factors for prostate cancer is ethnicity (Ross et
al. 1998; Greenlee et al. 2000). The incidence of prostate
cancer varies significantly across ethnic groups, with
African American men having the highest rates worldwide;
they are almost two times more likely to develop
prostate cancer than European American men (Greenlee
et al. 2000). The ethnic variation in prostate cancer incidence
suggests that genetic factors in combination with
environmental factors play a vital role in determining
prostate cancer risk (Ross et al. 1998). In addition, several
candidate genes for prostate cancer exhibit large allele
frequency differences between African Americans and
European Americans. These genes include the CAG-repeat
tract within the androgen receptor gene (Irvine et al.
1995; Sartor et al. 1999; Kittles et al. 2001a), a TA-repeat
tract within the SRD5AR gene (Reichardt et al. 1995), and
the CYP3A4 promoter variant examined here.


In this study, we have used a case-control association
design to determine whether CYP3A4-V is associated with
prostate cancer in African Americans after controlling for
population stratification. First, we have examined whether
the CYP3A4 association observed among African Americans
exists in other populations by comparing data from
two populations ancestral to African Americans: Nigerians
and European Americans. Second, using the three
groups of self-reported ethnicities, we performed Cochran-Mantel-Haenszel
tests to determine if a possible association
exists between the CYP3A4 variant and prostate cancer.
Then, to assess population stratification directly
within the African American population, we have typed
10 unlinked (unrelated to prostate cancer) autosomal genetic
markers, which, like the CYP3A4 variant, exhibit
large differences (>40%) in allele frequencies (δ) between
Africans and Europeans (Parra et al. 1998, 2001). To correct
for the stratification, we used the method of Reich
and Goldstein (2001), which utilizes the association statistics
observed at the 10 unlinked markers to determine
whether the significant association at the candidate marker
truly indicates the presence of a disease-related gene. We
observed a strong association between the CYP3A4 genotype
and prostate cancer, in addition to significant population
stratification in African Americans. The subsequent
test for the association of CYP3A4-V with prostate cancer
(controlling for the stratification) is not significant. Our
results show that genetic association studies with African
Americans are highly susceptible to confounding because
of population stratification.


Subjects  and  methods 

Study  subjects 

Unrelated men were enrolled from three sites for a population-based case-control study of risk factors for prostate cancer (Table 1). The study was approved by the Howard University Institutional Review Board, and written consent was obtained from all subjects. All prostate cancer cases were between 40 and 85 years of age and had been diagnosed with prostate cancer within the last 2 years. African Americans (n=220; 84 prostate cancer patients and 136 healthy male controls) were recruited from the Washington, DC area through the Division of Urology at the Howard University Hospital and/or prostate cancer screening at the Howard University Cancer Center. The response rate among the African American cases was 92%. Healthy African American male volunteers were enrolled among individuals undergoing regular physical examinations at the Division of Urology at Howard University Hospital and/or men participating in screening programs for prostate cancer at the Howard University Cancer Center. The response rate for the African American controls was 90%. European Americans (n=309; 215 prostate cancer patients and 94 healthy male volunteers) were enrolled through various prostate-cancer-screening programs in Baltimore, sponsored by the Johns Hopkins Cancer Center. For the African American and European American participants, ethnicity was self-reported, and individuals of mixed ancestry were not excluded. In addition, Nigerians (n=159; 77 prostate cancer patients and 82 healthy male controls) belonging to the Yoruba ethnic group were enrolled through the University College Hospital in Ibadan, Nigeria and through a community-based study of risk factors for prostate cancer during the summer of 2000 in collaboration with the University of Benin Teaching Hospital in Benin City, Nigeria. The response rate among the Nigerian controls was 85%. Blood samples were collected from each subject. Clinical characteristics, including Gleason grade, prostate-specific antigen (PSA), TNM stage, age at diagnosis, and family history, were obtained from medical records. A positive family history was defined as having a first-degree relative affected with prostate cancer. Among the African American subjects, 18% of the prostate cancer cases and 20% of the healthy volunteers reported a positive family history of prostate cancer. For the European American subjects, 10% of the prostate cancer cases and 7% of the healthy volunteers reported a positive family history. No family history data were collected for the Nigerian samples. All healthy controls had PSA levels less than 4.0 ng/ml and normal digital rectal examinations. The mean age at diagnosis among all prostate cancer patients was 64±1.1 years. The mean age among the controls was 63±1.6 years.

Table 1 Clinical characteristics of the prostate cancer (PCa) cases and healthy controls (PSA prostate-specific antigen, NA information not collected)

Population            n     Mean age (±SEM)   Mean PSA (±SEM)   Family history
African Americans     220
  PCa cases           84    67.8 (1.2)        24.3 (4.7)        15 (18%)
  Controls            136   63.3 (0.8)        2.3 (0.6)         27 (20%)
Nigerians             159
  PCa cases           77    73.1 (1.1)        169.8 (19.1)      NA
  Controls            82    71.9 (1.6)        1.5 (0.5)         NA
European Americans    309
  PCa cases           215   63.4 (0.7)        20.4 (2.7)        21 (10%)
  Controls            94    60.1 (0.6)        1.2 (0.4)         6 (7%)


Genotyping 

Genomic DNA was isolated from lymphocytes by standard proteinase K digestion, cell lysis, protein precipitation, and DNA precipitation. Genotyping of the CYP3A4 A to G polymorphism was performed by using polymerase chain reaction (PCR) and restriction digestion. PCR amplification of the polymorphism was carried out with 200 nM forward primer (5'-GGA CAG CCA TAG AGA CAA GGG GA-3') and 200 nM reverse primer (5'-CAC TCA CTG ACC TCC TTT GAG TTC A-3'), which produced a 190-bp fragment. The PCR mix consisted of 30 ng genomic DNA, 12 pmol each primer, 1.25 U AmpliTaq polymerase (Perkin Elmer, Foster City, Calif.), 10× PCR buffer II (Perkin Elmer), 1.6 mM MgCl2, 0.7 mM dNTP, and 10% dimethylsulfoxide (DMSO; Sigma, St. Louis, Mo.) in a total volume of 25 µl. Reaction conditions included an initial melting step at 95°C for 5 min, followed by 35 cycles of melting at 95°C for 30 s, annealing at 60°C for 25 s, and extending at 72°C for 30 s. A final extension was set at 72°C for 4 min. Restriction enzyme digestion was performed on the PCR fragment in 10 µl PCR product, 2 µl 10× buffer II, 0.2 µl 100× bovine serum albumin, 2 µl (10 U) MboII (New England BioLabs), and 5.8 µl water in a total volume of 20 µl and incubated at 37°C overnight. The resultant fragments were electrophoresed on a 4% agarose gel containing ethidium bromide. Bands were then visualized by UV trans-illumination. All samples were assayed in duplicate directly from genomic DNA together with a set of control DNAs that included known homozygous AA and GG and heterozygous AG genotypes. These control DNAs were confirmed by direct DNA sequencing in an ABI 377 DNA sequencer (Applied Biosystems, Foster City, Calif.).

In addition, ten autosomal markers (APOA1, AT3, FY, ICAM1, LPL, D11S429, OCA2, RB1, Sb19.3, and GC) were genotyped in the African American samples by standard PCR and electrophoretic separation of DNA fragments. APOA1 and Sb19.3 are ALU polymorphisms, AT3 is a 68-bp insertion/deletion polymorphism, and FY, ICAM1, LPL, OCA2, RB1, GC, and D11S429 are SNPs typed by restriction enzymes. The primer sequences and PCR conditions for the ten loci are described in detail in Parra et al. (1998, 2001).


Statistical  analysis 


Genotype and allele frequencies were calculated for each population. Hardy-Weinberg equilibrium for each population was evaluated by contingency table analysis. Two-sided Pearson chi-square (χ²) statistics, odds ratios, and P-values were determined from comparisons of individual and combined genotype classes between prostate cancer patients and healthy controls for each of the three populations. Regression analyses were used to assess whether age at diagnosis and family history modified the relationship between CYP3A4 and prostate cancer risk. In addition, the Cochran-Mantel-Haenszel χ² statistic was used to test for association with prostate cancer after adjusting for the different ethnic groups. To detect stratification within the African American population, Pearson χ² tests for association with prostate cancer were performed on genotypes at each of the 10 unlinked markers. The sum of the test statistics for each locus was then computed with the number of degrees of freedom (df) being equal to the sum of the number of df of the individual loci (Pritchard and Rosenberg 1999).
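The summed-statistic test for stratification described above can be sketched as follows. This is an illustrative sketch, not the SAS code used in the study: the per-marker χ² values are those listed in Table 4, each marker is assumed to contribute 1 df, and the helper names (`chi2_sf_even_df`, `stratification_test`) are hypothetical.

```python
import math

# Pearson chi-square statistics for the 10 unlinked markers (values from Table 4).
MARKER_CHI2 = {
    "APOA1*1": 3.75, "AT3*1": 2.40, "GC*1F": 12.79, "FY-Null*1": 0.06,
    "ICAM1*1": 0.05, "LPL*1": 0.72, "OCA2*1": 5.38, "RB1*1": 3.93,
    "SB19.3*1": 0.0002, "D11S429*1": 0.82,
}


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # (x/2)^k / k!
        total += term
    return math.exp(-half) * total


def stratification_test(chi2_by_marker, df_per_marker=1):
    """Sum the per-marker statistics; df is the sum of the per-marker df (assumed 1 each)."""
    total = sum(chi2_by_marker.values())
    df = df_per_marker * len(chi2_by_marker)
    return total, df, chi2_sf_even_df(total, df)


total_chi2, total_df, p_value = stratification_test(MARKER_CHI2)
print(f"chi2 = {total_chi2:.1f}, df = {total_df}, P = {p_value:.4f}")
```

With the Table 4 values, the summed statistic is 29.9 on 10 df and the P-value falls a little below 0.001, the same order of magnitude as the total reported in Table 4.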

To correct for the stratification, the mean of the unlinked marker test statistics (χ²) was determined. A 95% upper confidence limit on the mean value (µ) was determined by multiplying the mean by 1.83 (based on using 10 markers and the χ² distribution in the absence of stratification; see Reich and Goldstein 2001). The candidate marker (CYP3A4) χ² value was divided by the adjusted mean (µ) of the unlinked markers, resulting in a χ² corrected for stratification (χ²corr) and a conservative P-value. Regression analyses were used to compare Gleason grade and TNM stage among prostate cancer patients. Prostate cancer patients were defined as "Low", i.e., T1a-T1c stage and/or Gleason grade lower than 7, or "High", i.e., T2-T4 or N (+) or M (+) stage and/or Gleason grade equal to or larger than 7 (see Rebbeck et al. 1998; Paris et al. 1999). In the regression model, age at diagnosis, PSA (total), and family history (affected first-degree relative) were controlled for among prostate cancer patients. The SAS Version 6.12 computer program (SAS Institute, Cary, N.C.) was used to compute all χ² tests, odds ratios, and P-values.
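The Cochran-Mantel-Haenszel test mentioned above can be illustrated with a short sketch. It is not the SAS procedure used in the study. It assumes that genotypes are collapsed into carriers (AG+GG) versus non-carriers (AA), that the three self-reported ethnic groups form the strata, and that no continuity correction is applied; the counts are taken from Table 2, and the function name is hypothetical.

```python
import math

# Per stratum: (case carriers, case non-carriers, control carriers, control non-carriers),
# where "carrier" means AG or GG. Counts are taken from Table 2.
STRATA = {
    "African Americans": (80, 4, 113, 23),
    "European Americans": (54, 161, 12, 82),
    "Nigerians": (74, 3, 81, 1),
}


def cochran_mantel_haenszel(tables):
    """Return (chi2, P, Mantel-Haenszel odds ratio) for a collection of 2x2 tables."""
    diff = var = num = den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        diff += a - (a + b) * (a + c) / n                       # observed minus expected
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        num += a * d / n
        den += b * c / n
    chi2 = diff * diff / var                  # no continuity correction (assumption)
    p = math.erfc(math.sqrt(chi2 / 2.0))      # upper tail of chi-square with 1 df
    return chi2, p, num / den


cmh_chi2, cmh_p, mh_or = cochran_mantel_haenszel(STRATA.values())
print(f"CMH chi2 = {cmh_chi2:.2f}, P = {cmh_p:.4f}, MH OR = {mh_or:.2f}")
```

Under these assumptions the statistic and the pooled odds ratio land close to the values reported in the Results (χ²=10.07; OR=2.35), which suggests, but does not prove, that the published analysis used the same collapsing of genotypes.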


Results 

CYP3A4-V  frequencies  across  populations 

The CYP3A4-V allele frequency was highest among Nigerians (87%), lowest among European Americans (10%), and intermediate among African Americans (66%). Previous reports (Paris et al. 1999; Walker et al. 1998) estimate the allele frequency for African Americans at about 53%. The higher CYP3A4-V allele frequency in our sample of unrelated African Americans may be attributable to differences in levels of admixture among geographically diverse African American communities. Genotype frequencies of CYP3A4-V in the three ethnic populations are shown in Table 2. Genotypes in each population were in Hardy-Weinberg equilibrium (P>0.05). Genotype frequencies differed significantly among the three control populations (P<0.001).
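As a rough check on the allele frequencies quoted above, the variant (G) allele frequency can be obtained by gene counting from genotype counts. The sketch below assumes that G is the variant allele and that the quoted frequencies refer to the healthy controls in Table 2; neither point is stated explicitly in the text, and the function name is hypothetical.

```python
# Control genotype counts (AA, AG, GG) taken from Table 2.
CONTROL_GENOTYPES = {
    "African Americans": (23, 44, 69),
    "European Americans": (82, 6, 6),
    "Nigerians": (1, 20, 61),
}


def variant_allele_frequency(n_aa, n_ag, n_gg):
    """Frequency of the G allele: each AG carries one copy, each GG carries two."""
    return (n_ag + 2 * n_gg) / (2 * (n_aa + n_ag + n_gg))


frequencies = {
    population: variant_allele_frequency(*counts)
    for population, counts in CONTROL_GENOTYPES.items()
}
for population, freq in frequencies.items():
    print(f"{population}: {freq:.3f}")
```

The control-based estimates come out near the percentages quoted in the text, with small differences that may reflect rounding or the inclusion of cases in the published figures.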
Table 2 CYP3A4 genotype frequencies in prostate cancer cases and healthy controls from three populations (n number of subjects, OR odds ratio, CI confidence interval). The two-sided P was obtained from Pearson χ² tests

Genotype              Cases        Controls     OR          95% CI      P-value
African Americans     n=84         n=136
  AA                  4 (5%)       23 (17%)     1.0 (ref)
  AG                  32 (38%)     44 (32%)     4.2         1.3-13.2    0.01
  GG                  48 (57%)     69 (51%)     4.0         1.3-12.3    0.01
  AG+GG               80 (95%)     113 (83%)    4.1         1.3-12.2    0.007
European Americans    n=215        n=94
  AA                  161 (75%)    82 (88%)     1.0 (ref)
  AG                  28 (13%)     6 (6%)       2.3         0.9-5.9     0.06
  GG                  26 (12%)     6 (6%)       0.8         0.8-5.6     0.09
  AG+GG               54 (25%)     12 (12%)     2.3         1.1-4.5     0.02
Nigerians             n=77         n=82
  AA                  3 (4%)       1 (1%)       1.0 (ref)
  AG                  23 (30%)     20 (24%)     0.4         0.3-3.9     0.41
  GG                  51 (66%)     61 (75%)     0.3         0.1-2.8     0.25
  AG+GG               74 (96%)     81 (99%)     0.3         0.1-2.9     0.28

Association of CYP3A4-V with prostate cancer

The relationship between CYP3A4-V and prostate cancer is presented in Table 2. Genotypes were compared between cases and controls within each of the three populations. For the African Americans, a strong association was observed with CYP3A4 genotype. Individuals with at least one copy of the variant allele were at increased risk for prostate cancer, with an odds ratio (OR) of 4.1 and a 95% confidence interval (CI) of 1.3-12.2 (P=0.007). Further analyses revealed no relationship between CYP3A4 and age at diagnosis or family history among the African Americans. An OR of 2.3 (95% CI 1.1-4.5; P=0.02) was observed for CYP3A4 genotype and prostate cancer among European Americans; however, the P-value was not significant after correction for multiple tests. Whereas the ORs were 2.3 (95% CI 0.9-5.9) and 0.8 (95% CI 0.8-5.6) for the AG and GG genotypes, respectively, among European Americans, they failed to be significant (P>0.06). As with the African Americans, the age at diagnosis and family history among the European Americans were not associated with CYP3A4 genotype (data not shown). No association between CYP3A4 genotype and prostate cancer was observed among Nigerians (P=0.28). The variant allele was common among Nigerians, with 96% (74 of 77) of prostate cancer patients and a striking 99% (81 of 82) of healthy controls possessing at least one copy of the allele (Table 2). Because of the high frequency of CYP3A4-V, a much larger sample size would be needed to detect an association with prostate cancer.
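The odds ratio quoted for African American carriers can be reproduced from the Table 2 counts. The sketch below uses the Woolf (logit) method for the 95% confidence interval, which is an assumption: the text does not state which interval SAS produced, so small differences in the limits are to be expected. The function name is hypothetical.

```python
import math


def odds_ratio_ci(case_exposed, case_ref, control_exposed, control_ref, z=1.96):
    """Odds ratio with a Woolf (logit) confidence interval for a 2x2 table."""
    odds_ratio = (case_exposed * control_ref) / (case_ref * control_exposed)
    se_log_or = math.sqrt(
        1 / case_exposed + 1 / case_ref + 1 / control_exposed + 1 / control_ref
    )
    lower = odds_ratio * math.exp(-z * se_log_or)
    upper = odds_ratio * math.exp(z * se_log_or)
    return odds_ratio, lower, upper


# African Americans, carriers (AG+GG) versus the AA reference class (Table 2):
# 80 case carriers, 4 AA cases, 113 control carriers, 23 AA controls.
or_aa, lower_aa, upper_aa = odds_ratio_ci(80, 4, 113, 23)
print(f"OR = {or_aa:.1f}, 95% CI {lower_aa:.1f}-{upper_aa:.1f}")
```

The point estimate rounds to the reported 4.1, and the interval is wide because only four cases fall in the AA reference class.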

The standard Cochran-Mantel-Haenszel test was employed to test for association between CYP3A4-V and prostate cancer in an attempt to control for differences among the three populations. Results of the analysis also indicated a strong association between the CYP3A4 genotype and prostate cancer while controlling for differences among the three populations of self-reported ethnicities (χ²=10.07, P=0.002; OR=2.35, 95% CI 1.4-3.9). Stratified analyses of CYP3A4 genotypes and clinical characteristics in the ethnic populations are shown in Table 3. These analyses, which controlled for age, PSA, and family history, revealed no association between CYP3A4 genotype and the combined Gleason grade and TNM stage in any of the three ethnic populations. Sample sizes within several of the categories, particularly among the Nigerians, were low and may have contributed to our inability to detect a relationship between genotype and clinical characteristics.

Table 3 Comparison of CYP3A4 genotype with grade/stage^a among prostate cancer patients (n number of subjects, OR odds ratio, CI confidence interval). The P-value was taken from logistic regression analyses controlling for age, PSA, and family history

Genotype              Low          High         OR          95% CI      P-value
African Americans     n=33         n=51
  AA                  2 (6%)       2 (4%)       1.0 (ref)
  AG                  15 (45%)     17 (33%)     0.9         0.1-7.0     0.90
  GG                  16 (49%)     32 (63%)     0.5         0.1-3.8     0.50
  AG+GG               31 (94%)     49 (96%)     0.6         0.1-4.7     0.65
European Americans    n=129        n=82
  AA                  90 (70%)     66 (80%)     1.0 (ref)
  AG                  31 (24%)     13 (16%)     1.7         0.9-3.5     0.12
  GG                  8 (6%)       3 (4%)       1.9         0.4-7.6     0.33
  AG+GG               39 (30%)     16 (20%)     1.8         0.9-3.4     0.08
Nigerians             n=8          n=69
  AA                  1 (13%)      2 (3%)       1.0 (ref)
  AG                  1 (13%)      22 (32%)     0.9         0.1-2.0     0.07
  GG                  6 (74%)      45 (65%)     0.3         0.2-3.4     0.28
  AG+GG               7 (87%)      67 (97%)     0.2         0.1-2.6     0.18

^a Grade/stage defined as Low (T1a-T1c stage and/or Gleason grade <7) or High (T2-T4 or N (+) or M (+) stage and Gleason grade ≥7)


Testing  and  correcting  for  population  stratification 

We tested for population stratification by comparing 10 unlinked autosomal genetic markers with prostate cancer in African Americans. Table 4 reveals that three of the 10 marker loci were nominally to strongly associated with prostate cancer in African Americans: GC*1F (P=0.0003), OCA2*1 (P=0.020), and RB1*1 (P=0.047). GC*1F is one of several alleles at the group-specific component locus (Mastana et al. 1996), and OCA2*1 is a SNP in exon 10 of the P-gene, a transporter protein involved in melanogenesis (Lee et al. 1995). RB1*1 is a polymorphism within the tumor suppressor retinoblastoma gene (Zheng and Lee 2001). Given what is known about the function of these three genes, it is unlikely that any of them plays a role in prostate cancer. Overall, the test for population stratification using all 10 of the unlinked markers was highly significant (χ²=29.9; df=10; P=0.0008; Table 4).

The individual test statistics for the 10 unlinked markers were also used to correct for the population stratification in the African American samples. Specifically, the mean of the 10 marker χ² statistics (2.98) was used to correct the initial Pearson χ² calculated for the CYP3A4 comparison with prostate cancer in African Americans. This corrected statistic (χ²corr), which accounts for the level of stratification, was not significant (P=0.25).
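The correction described above can be traced numerically with a short sketch. The mean marker statistic (2.98) and the multiplier (1.83) come from the text; the candidate χ² is not printed in this section, so it is recomputed here from the Table 2 carrier counts for African Americans, which is an assumption about how the original statistic was formed. Function names are hypothetical.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def corrected_chi2(candidate_chi2, mean_marker_chi2, multiplier=1.83):
    """Divide the candidate statistic by the upper confidence limit of the marker mean."""
    adjusted_mean = mean_marker_chi2 * multiplier
    chi2_corr = candidate_chi2 / adjusted_mean
    p_corr = math.erfc(math.sqrt(chi2_corr / 2.0))  # upper tail, chi-square with 1 df
    return chi2_corr, p_corr


# Carriers (AG+GG) versus AA among African American cases and controls (Table 2).
candidate = pearson_chi2_2x2(80, 4, 113, 23)
chi2_corr, p_corr = corrected_chi2(candidate, mean_marker_chi2=2.98)
print(f"candidate chi2 = {candidate:.2f}, corrected chi2 = {chi2_corr:.2f}, P = {p_corr:.2f}")
```

Under these assumptions the uncorrected statistic corresponds to the P=0.007 reported for carriers, and dividing by 2.98 × 1.83 brings the P-value to about 0.25, in agreement with the corrected result quoted above.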


Table 4 Test for stratification by comparison of unlinked markers^a with prostate cancer in African Americans

Marker        Locus          χ²        P-value^b
APOA1*1       11q23          3.75      0.055
AT3*1         1q23-q25       2.40      0.122
GC*1F         4q12-q13       12.79     0.0003
FY-Null*1     1q22-q23       0.06      0.806
ICAM1*1       19p13          0.05      0.820
LPL*1         8p22           0.72      0.395
OCA2*1        15q11.2-q12    5.38      0.020
RB1*1         13q14.3        3.93      0.047
SB19.3*1      19             0.0002    0.987
D11S429*1     11             0.82      0.365
TOTAL                        29.9      0.0008

^a All loci are unlinked genetically, except FY and AT3, which are ~22 cM apart on chromosome 1
^b Two-sided P-value from Pearson χ² tests


Discussion 

Even though case-control association studies may be powerful for detecting the non-random association between an allele and a trait, they are vulnerable to confounding because of population stratification (Chakraborty and Weiss 1988; Lander and Schork 1994). Population stratification can be caused by various circumstances (Reich and Goldstein 2002). One example is when two or more groups with different allele frequencies are pooled together in the case and control samples under study. In addition, admixture can create population stratification. This is especially the case for the African American population, which, because of its unique population history, represents a varied mixture of African, European, and Native American ancestry. The problem of stratification is compounded when the disease of interest is more prevalent in one of the populations, as is the case with prostate cancer among African Americans. Any alleles that are more common among African Americans will tend to be associated with the disease, even if they are completely unlinked to the disease-causing locus. Several approaches have been used to deal with the problem of population stratification. One approach is to match the ethnic backgrounds of cases and controls. However, considerable "cryptic" or hidden stratification may still remain (Ewens and Spielman 1995). Another approach has been to collect controls from families of affected individuals. Depending on the study, family-based control methods such as the transmission disequilibrium test (TDT) and a related method (sib-TDT; Ewens and Spielman 1995; Spielman and Ewens 1998) are more difficult and costly than collecting unrelated cases and controls. For diseases with a late age of onset, such as prostate cancer, the availability of parents and siblings for sampling is greatly reduced. Furthermore, in some instances, TDT-type designs may exhibit less power compared with case-control studies because of over-matching of unaffected sibs to probands (Risch 2000; Risch and Teng 1998; Morton and Collins 1998). Recently, it has been proposed that unlinked genetic markers should be typed in order to detect, quantify, and correct for stratification in the case-control study (Pritchard and Rosenberg 1999; Pritchard et al. 2000; Devlin and Roeder 1999; Schork et al. 2001; Reich and Goldstein 2001).

In this study, we have shown that population stratification is a potential problem for association studies in the African American population when there are differences in allele frequencies between the parental populations. Our results on CYP3A4-V, a candidate gene polymorphism for prostate cancer, have revealed that the promoter allele frequency differs significantly between populations ancestral to African Americans. Not surprisingly, a strong association was observed between CYP3A4 genotype and prostate cancer in African Americans. This association is consistent with previous studies of CYP3A4 and prostate cancer in African Americans (Paris et al. 1999). In the previous work, population substructure within the African American population was not evaluated even though the CYP3A4-V frequency varied substantially among the different African American populations under study (Ball et al. 1999; Walker et al. 1998; Paris et al. 1999). In this study, we have provided evidence for population stratification within the African American population from Washington, DC, by performing tests of association with 10 unlinked autosomal genetic markers. These analyses revealed that three of the 10 marker alleles are also significantly associated with prostate cancer. Reich and Goldstein (2001) suggest that the unlinked markers should be matched to the candidate locus based on allele frequency. However, we have used a panel of 10 unlinked genetic markers similar to the candidate gene locus in terms of large δ-values (>40%) between Africans and Europeans. The panel of unlinked genetic markers provides a more powerful tool for detecting genetic substructure (stratification) within the African American population than randomly chosen markers with allele frequencies similar to the candidate locus (Pfaff et al. 2002; Reich and Goldstein 2002). The substructure detected is attributable to differences in ancestral proportions between the prostate cancer cases and clinically evaluated controls. It is likely that stratification in the African American samples resulted in a spurious association of CYP3A4-V with prostate cancer in our samples. If so, the confounding would have to be strong because, even after taking into account self-reported ethnicity by using the Cochran-Mantel-Haenszel analysis, the association with the CYP3A4 genotype is still quite significant. This may reflect the inherent problem of "cryptic" stratification when using self-reported "race" or ethnicity in grouping individuals for genetic epidemiological studies.

This is the first study that, while examining the role that a genetic variant plays in the etiology of prostate cancer in African Americans, also assesses and corrects for population stratification within the population. In order to determine the roles that genes such as CYP3A4-V play in the etiology of prostate cancer among African Americans, methods that deal with the issue of stratification are important because of differences in allele frequencies and disease prevalence among populations. In terms of CYP3A4-V specifically, more research into the functional consequence of the CYP3A4 promoter variant is needed. It is possible that the polymorphism leads to an altered form of the transcriptional regulatory element (NFSE), possibly affecting gene expression in a way that ultimately may result in the decreased oxidation of testosterone. Although some research supports altered function, there is no consensus (Westlind et al. 1999; Amirimani et al. 1999; Ando et al. 1999; Ball et al. 1999; Rebbeck 2000), and we have not found strong supportive evidence in our association studies of the European American and Nigerian clinical samples.

The results for the AG and GG genotypes among European Americans were suggestive before correction for multiple tests (OR=2.3; P=0.02). The trend was intriguing and warranted a test for stratification in the European American samples but, because of limited genomic DNA, we were unable to genotype all of the European American samples. We were, however, able to type the unlinked markers in individuals with prostate cancer who possessed at least one copy of the CYP3A4-V allele (n=54). Surprisingly, the results suggested that 15% of these individuals also possessed at least one copy of the African-associated alleles at the FY and GC marker loci. This being the case, these individuals probably have considerable African ancestry but classify themselves as European Americans.

These interesting results need to be confirmed by typing a larger number of unlinked markers with high δs. As many of these markers are becoming widely available, they will improve the power to detect and control for population stratification (Parra et al. 1998, 2001; Pfaff et al. 2001, 2002; Pritchard and Rosenberg 1999). In addition, these markers can be utilized for genetic approaches, such as mapping by admixture linkage disequilibrium, a potentially useful application for the identification of genes contributing to complex genetic diseases (Briscoe et al. 1994; Stephens et al. 1994; Smith et al. 1996; Zheng and Elston 1999; Pfaff et al. 2001, 2002; Collins-Schramm et al. 2002). Increasing the number of unlinked genetic markers that are informative for ancestry should also allow us to estimate individual admixture in the African American population. An interesting and potentially powerful utilization of the individual admixture estimates would be as independent risk factors on which cases and controls can be matched for analysis (Williams et al. 2000). Stratification would thereby be minimized, and relationships between prostate cancer and candidate loci would be better evaluated. We intend to explore this relationship in a larger dataset in the future.

Our inability to detect an association among the Nigerian samples may be attributable to several reasons. The first is the high frequency of the variant in the population. The high frequency of the CYP3A4-V allele among Nigerians possibly contributed to the difficulty in observing possible low-penetrance effects in the Nigerian population. A much larger sample size may be needed in order to detect a genetic effect. Another explanation may be that specific genotype-environment interactions are absent or of low effect in rural Nigeria because of diet and lifestyle differences from those of African Americans. Even though population stratification may have contributed to the association with prostate cancer in the African American samples, these scenarios cannot be ruled out as likely explanations for the lack of association within the Nigerian and European American samples.

Future studies may provide greater knowledge of the role, if any, that the CYP3A4 gene plays in prostate cancer development and progression. However, other genes or genomic regions have been identified that may contribute to the susceptibility to prostate cancer, such as the highly penetrant but less common alleles at the HPC1 (Smith et al. 1996), HPC2 (Tavtigian et al. 2001), and HPCX (Xu et al. 1998) loci and the common low-penetrance alleles at the androgen receptor (Giovannucci et al. 1997), SRD5A2 (Jaffe et al. 2000), and CYP17 (Habuchi et al. 2000; Gsur et al. 2000; Kittles et al. 2001b) genes.